Abstract Modelling the process of recombination leads to a large coupled nonlinear dynamical
system. Here, we consider a particular case of recombination in discrete time, allowing only for single 
crossovers. While the analogous dynamics in continuous time admits a closed solution [3], this no
longer works for discrete time. A more general model (i.e. without the restriction to single crossovers) 
has been studied before 0, and was solved algorithmically by means of Haldane linearisation. 
Using the special formalism introduced in [3], we obtain further insight into the single-crossover
dynamics and the particular difficulties that arise in discrete time. We then transform the equations 
to a solvable system in a two-step procedure: linearisation followed by diagonalisation. Still, the 
coefficients of the second step must be determined in a recursive manner, but once this is done for 
a given system, they allow for an explicit solution valid for all times. 

Keywords population genetics · recombination dynamics · Möbius linearisation · diagonalisation

Mathematics Subject Classification (2000) 92D10 · 37N30 · 06A07 · 60J05



1 Introduction 

The dynamics of the genetic composition of populations evolving under recombination has been a 
long-standing subject of research. The traditional models assume random mating, non-overlapping 
generations (meaning discrete time), and populations so large that stochastic fluctuations may be 
neglected and a law of large numbers (or infinite-population limit) applies. Even this highly idealised 
setting leads to models that are notoriously difficult to treat and solve, namely, to large systems of 
coupled, nonlinear difference equations. Here, the nonlinearity is due to the random mating of the 
partner individuals involved in sexual reproduction. 
Elucidating the underlying structure and finding solutions to these equations has been a challenge
to theoretical population geneticists for nearly a century now. The first studies go back to
Jennings [13] in 1917 and Robbins [17] in 1918. Building on [13], Robbins solved the dynamics for
two diallelic loci (to be called sites from now on) and gave an explicit formula for the (haplo)type
frequencies as functions of time. Geiringer [11] investigated the general recombination model for
an arbitrary number of loci and for arbitrary 'recombination distributions' (meaning collections of 
probabilities for the various partitionings of the sites that may occur during recombination) in 1944. 
She was the first to state the general form of the solution of the recombination equation (as a con- 
vex combination of all possible products of certain marginal frequencies derived from the initial 
population) and developed a method for the recursive evaluation of the corresponding coefficients. 
This simplifies the calculation of the type frequencies at any time compared to the direct evaluation 
through successive iteration of the dynamical system. Even though she worked out the method for 
the general case in principle, its evaluation becomes quite involved for more than three sites. 



Her work was followed by Bennett [5] in 1954. He introduced a multilinear transformation of the 
type frequencies to certain functions that he named principal components. They correspond to linear 
combinations of certain correlation functions that transform the dynamical system (exactly) into 
a linear one. The new variables decay independently and geometrically for all times, whence they 
decouple and diagonalise the dynamics. They therefore provide an elegant solution in principle, but 
the price to be paid is that the coefficients of the transformation must be constructed via recursions 
that involve the parameters of the recombination model. Bennett worked this method out for up to 
six sites, but did not give an explicit method for an arbitrary number of sites. The approach was later 
completed within the systematic framework of genetic algebras, where it became known as Haldane 



linearisation, compare IJ, Il5| . But, in fact, Bennett's program may be completed outside this 



abstract framework, as was shown by Dawson [9, 10], who derived a general and explicit recursion for 
the coefficients of the principal components. However, the proofs are somewhat technical and do not 
reveal the underlying mathematical structure. It is the aim of this paper to provide a more systematic, 
but still elementary, approach that exploits the inherent (multi) linear and combinatorial structure 
of the problem — at least for one particular, but biologically relevant, special case, which will now 
be described. Our special case is obtained by the restriction to single crossovers, which leads to what 
we call single-crossover recombination (SCR). This is the extreme case of the biological phenomenon 
of interference, and describes the situation where a crossover event completely inhibits any other 
crossover event in the same generation, at least within the genomic region considered. Surprisingly, 
the corresponding dynamics in continuous time can be solved in closed form [2, 3]. Again, a crucial 
ingredient is a transformation to certain correlation functions (or linkage disequilibria) that linearise 
and diagonalise the system. Luckily, in this case, the corresponding coefficients are independent of 
the recombination parameters, and the transformation is available explicitly. 



Motivated by this result, we now investigate the analogous single-crossover dynamics in discrete 
time. The paper is organised as follows. We first describe the discrete-time model and the general 
framework (Section 2) and then recapitulate the essentials of the continuous-time model and its
solution (Section 3). Section 4 returns to discrete time. We first analyse explicitly the cases of two,
three, and four sites. For two and three sites, the dynamics is analogous to that in continuous
time (and, in particular, available in closed form), but differs from then on. This is because a
certain linearity present in continuous time is now lost. The transformations used in continuous time
are therefore not sufficient to both linearise and diagonalise the discrete-time dynamics. They do,
however, lead to a linearisation; this is worked out in Sections 5 and 6. The resulting linear system
has a triangular structure that can be diagonalised in a second step in a recursive way (Section 7). We
summarise and discuss our results in Section 8. An explicit example is worked out in the Appendix.
2 Preliminaries and notation

Let us briefly recall the recombination model described in [3] and the special notation introduced
there, as the remainder of this paper critically depends on it. A chromosome (of length $n+1$, say) is
represented as a linear arrangement of the $n+1$ sites of the set $S = \{0, 1, \ldots, n\}$. Sites are discrete
positions on a chromosome that may be interpreted as gene or nucleotide positions. A set $X_i$ collects
the possible elements (such as alleles or nucleotides) at site $i$. For convenience, we restrict ourselves
to finite sets $X_i$ in this paper, though much of the theory can be extended to the case that each $X_i$
is a locally compact space, which can be of importance for applications in quantitative genetics. A
type is now defined as a sequence $(x_0, x_1, \ldots, x_n) \in X_0 \times X_1 \times \cdots \times X_n =: X$, where $X$ denotes the
(finite) type space.

Recombination events take place at the so-called links between neighbouring sites, collected into
the set $L = \{\tfrac12, \tfrac32, \ldots, \tfrac{2n-1}{2}\}$, where link $\alpha = \tfrac{2i+1}{2}$ is the link between sites $i$ and $i+1$. Since we only
consider single crossovers here, each individual event yields an exchange of the sites either before or
after the respective link between the two types involved. A recombination event at link $\tfrac{2i+1}{2}$ that
involves $x = (x_0, \ldots, x_n)$ and $y = (y_0, \ldots, y_n)$ thus results in the types $(x_0, \ldots, x_i, y_{i+1}, \ldots, y_n)$ and
$(y_0, \ldots, y_i, x_{i+1}, \ldots, x_n)$, with both pairs considered as unordered.

Although one is ultimately interested in the stochastic process defined by recombination acting on 
populations of finite size, compare [4] and references therein, we restrict ourselves to the deterministic 
limit of infinite population size here, also known as infinite population limit (IPL). Consequently, 
we are not looking at the individual dynamics, but at the induced dynamics on the probability 
distribution on the type space $X$. Let $\mathcal{P}(X)$ denote the convex space of all possible probability
distributions on $X$. As $X$ is finite, a probability distribution can be written as a vector $p = (p(x))_{x \in X}$,
where $p(x)$ denotes the relative frequency of type $x$ in the population.

Let us look at the time evolution of the relative frequencies $p_t(x)$ of types $x = (x_0, \ldots, x_n)$ when
starting from a known initial distribution $p_0$ of the population at time $t = 0$. In discrete time, it is
given by the following collection of recombination equations for all $x \in X$:

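$$
p_{t+1}(x) = \sum_{\alpha \in L} \rho_\alpha \, p_t(x_0, \ldots, x_{\lfloor\alpha\rfloor}, *, \ldots, *) \, p_t(*, \ldots, *, x_{\lceil\alpha\rceil}, \ldots, x_n)
+ \Bigl(1 - \sum_{\alpha \in L} \rho_\alpha\Bigr) p_t(x), \quad \text{with } t \in \mathbb{N}_0, \tag{1}
$$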

where the coefficients $\rho_\alpha$, $\alpha \in L$, are the probabilities for a crossover at link $\alpha$. Consequently, we
must have $\rho_\alpha \ge 0$ and $\sum_{\alpha \in L} \rho_\alpha \le 1$, where $\rho_\alpha > 0$ is assumed from now on without loss of generality
(when $\rho_\alpha = 0$, the set $X_{\alpha - 1/2} \times X_{\alpha + 1/2}$ can be considered as a space for an effective site that comprises
$i = \alpha - \tfrac12$ and $i = \alpha + \tfrac12$). When the $\rho_\alpha$ do not sum to 1, the remainder is the probability that
no crossover occurs, which is taken care of by the last term in the equation. Moreover, $\lfloor\alpha\rfloor$ ($\lceil\alpha\rceil$)
denotes the largest integer below (the smallest above) $\alpha$ and the star $*$ at site $i$ stands for $X_i$, and
thus indicates marginalisation over site $i$.

An important step to solve the large nonlinear coupled system of equations (1) lies in its reformulation
in a more compact way with the help of certain recombination operators. To construct
them, we need the canonical projection operator $\pi_i : X \to X_i$, defined by $x \mapsto \pi_i(x) = x_i$ as usual.
Likewise, for any index set $J \subseteq S$, the projector $\pi_J$ is defined as $\pi_J : X \to X_J := \times_{i \in J} X_i$. We
will frequently use

$$
\pi_{<\alpha} := \pi_{\{0, \ldots, \lfloor\alpha\rfloor\}} \quad \text{and} \quad \pi_{>\alpha} := \pi_{\{\lceil\alpha\rceil, \ldots, n\}} .
$$

These can be understood as cut-and-forget operators since they 'cut out' the leading and the trailing
segment of a type $x$, respectively, and 'forget' about the rest. The projectors induce linear mappings
from $\mathcal{P}(X)$ to $\mathcal{P}(X_J)$ by $p \mapsto \pi_J.p := p \circ \pi_J^{-1}$, where $\pi_J^{-1}$ denotes the preimage under $\pi_J$ and $\circ$
indicates composition of mappings. The operation $.$ (not to be confused with a multiplication sign) is
known as the pullback of $\pi_J$ with respect to $p$. Consequently, $\pi_J.p$ is simply the marginal distribution
of $p$ with respect to the sites of $J$.

Now consider recombination at link a, performed on the entire population. Since the resulting 
population consists of randomly chosen leading segments relinked with randomly chosen trailing 
segments, it may be described through the (elementary) recombination operator (or recombinator 
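for short) $R_\alpha : \mathcal{P}(X) \to \mathcal{P}(X)$, defined by $p \mapsto R_\alpha(p)$ with

$$
R_\alpha(p) := (\pi_{<\alpha}.p) \otimes (\pi_{>\alpha}.p), \tag{2}
$$

where $\otimes$ denotes the product measure and reflects the independent combination of both marginals
$\pi_{<\alpha}.p$ and $\pi_{>\alpha}.p$. Note that the recombinators are structural operators that do not depend on the
recombination probabilities.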
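
As an illustration only, the action of a recombinator can be sketched numerically for a small finite type space. The sketch assumes that a distribution is stored as a NumPy array with one axis per site; the function name `recombinator` and the convention of addressing the link $\alpha = i + \tfrac12$ by the integer $i$ are choices made here, not notation from the text.

```python
import numpy as np

def recombinator(p, i):
    """Apply R_alpha for the link alpha = i + 1/2 between sites i and i+1.

    p is an array with one axis per site holding the type frequencies.
    """
    n_sites = p.ndim
    lead = p.sum(axis=tuple(range(i + 1, n_sites)))   # marginal on sites 0..i
    trail = p.sum(axis=tuple(range(i + 1)))           # marginal on sites i+1..n
    return np.multiply.outer(lead, trail)             # product measure

# Hypothetical example: three sites with 2, 3 and 2 possible letters.
rng = np.random.default_rng(0)
p = rng.random((2, 3, 2))
p /= p.sum()

r0 = recombinator(p, 0)                      # recombination at link 1/2
r01 = recombinator(recombinator(p, 0), 1)    # first link 1/2, then link 3/2
r10 = recombinator(recombinator(p, 1), 0)    # the other order
```

In this example `r0` is again a probability distribution with the same leading marginal as `p`, applying `recombinator(., 0)` twice changes nothing, and `r01` agrees with `r10`, in line with the idempotence and commutativity recalled next.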

Before we rewrite the recombination equations in terms of these recombinators, let us recall 
some of their elementary properties, see Q for proofs. First of all, the elementary recombinators 
are idempotents and commute with one another on $\mathcal{P}(X)$. This permits the consistent definition of
composite recombinators

$$
R_G := \prod_{\alpha \in G} R_\alpha \tag{3}
$$

for arbitrary subsets $G \subseteq L$. In particular, one has $R_\varnothing = 1$ and $R_{\{\alpha\}} = R_\alpha$.

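Proposition 1 On $\mathcal{P}(X)$, the elementary recombinators are commuting idempotents. For $\alpha \le \beta$,
they satisfy

$$
\pi_{<\alpha}.\bigl(R_\beta(p)\bigr) = \pi_{<\alpha}.p \quad \text{and} \quad \pi_{>\alpha}.\bigl(R_\beta(p)\bigr) = \bigl(\pi_{\{\lceil\alpha\rceil, \ldots, \lfloor\beta\rfloor\}}.p\bigr) \otimes \bigl(\pi_{>\beta}.p\bigr); \tag{4}
$$

likewise, for $\alpha \ge \beta$,

$$
\pi_{>\alpha}.\bigl(R_\beta(p)\bigr) = \pi_{>\alpha}.p \quad \text{and} \quad \pi_{<\alpha}.\bigl(R_\beta(p)\bigr) = \bigl(\pi_{<\beta}.p\bigr) \otimes \bigl(\pi_{\{\lceil\beta\rceil, \ldots, \lfloor\alpha\rfloor\}}.p\bigr). \tag{5}
$$

Furthermore, the composite recombinators satisfy

$$
R_G R_H = R_{G \cup H} \tag{6}
$$

for arbitrary $G, H \subseteq L$. □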

These properties can be understood intuitively as well: (4) says that recombination at or after link
$\alpha$ does not affect the marginal frequencies at sites before $\alpha$, whereas the marginal frequencies at
the sites after $\alpha$ change into the product measure (and vice versa in (5)). Furthermore, repeated
recombination at link $\alpha$ does not change the situation any further (recombinators are idempotents)
and the formation of the product measure with respect to $\ge 2$ links does not depend on the order in
which the links are affected. As we shall see below, these properties of the recombinators are crucial 
for finding a solution of the SCR dynamics, both in continuous and in discrete time. 



3 SCR in continuous time 

Let us briefly review the SCR dynamics in continuous time, as its structure will be needed below. 
Making use of the recombinators introduced above, the dynamics (in the IPL) is described by a 
system of differential equations for the time evolution of the probability distribution (or measure), 
starting from an initial condition Pq at t = 0. It reads Q 

$$
\dot{p}_t = \sum_{\alpha \in L} \rho_\alpha (R_\alpha - 1)(p_t), \tag{7}
$$

where $\rho_\alpha$ is now the rate for a crossover at link $\alpha$. Though (7) describes a coupled system of nonlinear
differential equations, the closed solution for its Cauchy (or initial value) problem is available [2, 3].
Theorem 1 The solution of the recombination equation (7) with initial value $p_0$ can be given in
closed form as

$$
p_t = \sum_{G \subseteq L} a_G(t) R_G(p_0), \tag{8}
$$

with the coefficient functions

$$
a_G(t) = \prod_{\alpha \in L \setminus G} \exp(-\rho_\alpha t) \prod_{\beta \in G} \bigl(1 - \exp(-\rho_\beta t)\bigr). \tag{9}
$$

These are non-negative functions, which satisfy $\sum_{G \subseteq L} a_G(t) = 1$ for all $t \ge 0$. □
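
The coefficient functions (9) are simple to evaluate. The following sketch is an illustration under assumptions made here: the rates in `rho` are made-up values, links are used directly as dictionary keys, and the helper names `subsets` and `a_coeff` are hypothetical.

```python
from itertools import combinations
from math import exp, prod

def subsets(links):
    """All subsets of an iterable of links, as frozensets."""
    links = list(links)
    return [frozenset(c) for r in range(len(links) + 1)
            for c in combinations(links, r)]

def a_coeff(G, rho, t):
    """Coefficient a_G(t) of Eq. (9); rho maps each link to its crossover rate."""
    return prod((1.0 - exp(-r * t)) if link in G else exp(-r * t)
                for link, r in rho.items())

# Hypothetical rates for the three links 1/2, 3/2, 5/2 (four sites).
rho = {0.5: 0.3, 1.5: 0.1, 2.5: 0.2}
coeffs = {G: a_coeff(G, rho, t=2.0) for G in subsets(rho)}
```

The eight values in `coeffs` are non-negative and add up to one, because each link contributes a factor $\exp(-\rho_\alpha t) + (1 - \exp(-\rho_\alpha t)) = 1$ to the total.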

The coefficient functions can be interpreted probabilistically. Given an individual sequence in the
population, $a_G(t)$ is the probability that the set of links that have seen at least one crossover event
until time $t$ is precisely the set $G$. Note that the product structure of the $a_G(t)$ implies independence
of links, a decisive feature of the single-crossover dynamics in continuous time, as we shall see later
on. By (8), $p_t$ is always a convex combination of the probability measures $R_G(p_0)$ with $G \subseteq L$.
Consequently, given an initial condition $p_0$, the entire dynamics takes place on the closed simplex
(within $\mathcal{P}(X)$) that is given by $\mathrm{conv}\{R_G(p_0) \mid G \subseteq L\}$, where $\mathrm{conv}(A)$ denotes the convex hull of $A$.

It is surprising that a closed solution for the dynamics ([7]) can be given explicitly, and this 
suggests the existence of an underlying linear structure Q , which is indeed the case and well known 



from similar equations, compare 1^. In the context of the formulation with recombinators, it can 



be stated as follows, compare [sj for details 

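Theorem 2 Let $\{c^{(L')}_{G'}(t) \mid \varnothing \subseteq G' \subseteq L' \subseteq L\}$ be a family of non-negative functions with
$c^{(L)}_{G}(t) = c^{(L_1)}_{G_1}(t)\, c^{(L_2)}_{G_2}(t)$ for any partition $L = L_1 \,\dot\cup\, L_2$ of the set $L$ and all $t \ge 0$, where $G_i := G \cap L_i$.
Assume further that these functions satisfy $\sum_{H \subseteq L'} c^{(L')}_{H}(t) = 1$ for any $L' \subseteq L$ and $t \ge 0$. If
$v \in \mathcal{P}(X)$ and $H \subseteq L$, one has the identity

$$
R_H \Bigl( \sum_{G \subseteq L} c^{(L)}_{G}(t)\, R_G(v) \Bigr) = \sum_{G \subseteq L} c^{(L)}_{G}(t)\, R_{G \cup H}(v),
$$

which is then satisfied for all $t \ge 0$. □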

Here, the upper index specifies the respective set of links. So far, Theorem 2 depends crucially on the
product structure of the functions $c^{(L)}_{G}(t)$, but we will show later how this assumption can be relaxed.
In any case, the coefficient functions $a_G(t)$ of (9) satisfy the conditions of Theorem 2. The result
then means that the recombinators act linearly along solutions (8) of the recombination equation (7).
Denoting $\varphi_t$ as the flow of Eq. (7), Theorem 2 thus has the following consequence.

Corollary 1 On $\mathcal{P}(X)$, the forward flow of (7) commutes with all recombinators, which means that
$R_G \circ \varphi_t = \varphi_t \circ R_G$ holds for all $t \ge 0$ and all $G \subseteq L$. □

The conventional approach to solve the recombination dynamics consists in transforming the 
type frequencies to certain functions which diagonalise the dynamics, see [l,S0iE3| and references 
therein for more. From now on, we will call these functions principal components after Bennett [5].
For the single-crossover dynamics in continuous time, they have a particularly simple structure: they
are given by certain correlation functions, or linkage disequilibria (LDE), which play an important
role in biological applications. They have a counterpart at the level of operators on $\mathcal{P}(X)$.

Namely, let us define LDE operators on $\mathcal{P}(X)$ as linear combinations of recombinators via

$$
T_G := \sum_{H \supseteq G} (-1)^{|H - G|} R_H, \quad \text{with } G \subseteq L, \tag{10}
$$






Ute von Wangenheim et al. 



so that the inverse relation is given by

$$
R_H = \sum_{G \supseteq H} T_G \tag{11}
$$

due to the combinatorial Möbius inversion formula, compare [1]. Let us note for further use that, by
Eq. (6) in Proposition 1, $T_G \circ R_G = T_G$. Note also that, for a probability measure $p$ on $X$, $T_G(p)$
is a signed measure on $X$; in particular, it need not be positive. The LDEs are given by certain
components of the $T_G(p)$, see [3, 4] for more. In the continuous-time single-crossover setting, it
was shown in Q that, if pt is the solution ([8}, the Tq{pi) satisfy 

$$
\frac{d}{dt} T_G(p_t) = -\Bigl( \sum_{\alpha \in L \setminus G} \rho_\alpha \Bigr) T_G(p_t), \quad \text{for all } G \subseteq L, \tag{12}
$$



which is a decoupled system of homogeneous linear differential equations, with the standard expo- 
nential solution. That is, the LDE operators both linearise and diagonalise the system, and the LDEs 
are thus, at the same time, principal components. 

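A straightforward calculation now reveals that the solution (8) can be rewritten as

$$
p_t = \sum_{G \subseteq L} a_G(t) R_G(p_0) = \sum_{K \subseteq L} b_K(t) T_K(p_0) = \sum_{K \subseteq L} T_K(p_t), \tag{13}
$$

where the new coefficient functions are given by

$$
b_K(t) := \exp\Bigl( -\sum_{\alpha \in L \setminus K} \rho_\alpha t \Bigr).
$$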
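
The first two representations in (13) can be compared at the level of coefficients. Expanding each $T_K$ via (10) shows that the coefficient of $R_H$ in $\sum_K b_K(t) T_K$ is $\sum_{K \subseteq H} (-1)^{|H - K|} b_K(t)$, which should equal $a_H(t)$. The sketch below checks this numerically; the rates and all helper names are assumptions made for illustration.

```python
from itertools import combinations
from math import exp, prod

def subsets(links):
    """All subsets of an iterable of links, as frozensets."""
    links = list(links)
    return [frozenset(c) for r in range(len(links) + 1)
            for c in combinations(links, r)]

def a_coeff(G, rho, t):
    """a_G(t) from Eq. (9)."""
    return prod((1.0 - exp(-r * t)) if link in G else exp(-r * t)
                for link, r in rho.items())

def b_coeff(K, rho, t):
    """b_K(t): exponential decay with the total rate of the links outside K."""
    return exp(-sum(r for link, r in rho.items() if link not in K) * t)

def coeff_of_R(H, rho, t):
    """Coefficient of R_H after expanding sum_K b_K(t) T_K via Eq. (10)."""
    return sum((-1) ** len(H - K) * b_coeff(K, rho, t) for K in subsets(H))

rho = {0.5: 0.3, 1.5: 0.1, 2.5: 0.2}   # hypothetical crossover rates
t = 1.7
max_gap = max(abs(coeff_of_R(H, rho, t) - a_coeff(H, rho, t))
              for H in subsets(rho))
```

Here `max_gap` is zero up to rounding, as expected from expanding the product $\prod_{\beta \in H} (1 - \exp(-\rho_\beta t))$ in (9).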

At this point, it is important to notice the rather simple structure of the LDE operators, which
do not depend on the crossover rates. Moreover, the transformation between recombinators and
LDE operators is directly given by the Möbius formula, see Eqs. (10) and (11). This is a significant
simplification in comparison with previous results, compare [a. lol. llO. where the coefficients of 
the transformation generally depend on the crossover rates and must be determined recursively. 

Below, we shall see that the SCR dynamics in continuous time is indeed a special case, and that 
the above results cannot be transferred directly to the corresponding dynamics in discrete time. 
Nevertheless, part of the continuous-time structure prevails and offers a useful entry point for the 
solution of the discrete-time counterpart. 



4 SCR in discrete time 

Employing recombinators, the SCR equations (1) in discrete time with a given initial distribution
$p_0$ can be compactly rewritten as

$$
p_{t+1} = p_t + \sum_{\alpha \in L} \rho_\alpha (R_\alpha - 1)(p_t) =: \Phi(p_t). \tag{14}
$$

As indicated, the nonlinear operator of the right-hand side of (14) is denoted by $\Phi$ from now on. We
aim at a closed solution of (14), namely for $p_t = \Phi^t(p_0)$ with $t \in \mathbb{N}_0$. Based on the result for the
continuous-time model, the solution is expected to be of the form

$$
p_t = \Phi^t(p_0) = \sum_{G \subseteq L} a_G(t) R_G(p_0), \tag{15}
$$

with non-negative $a_G(t)$, $G \subseteq L$, $\sum_{G \subseteq L} a_G(t) = 1$, describing the (unknown) coefficient functions
arising from the dynamics. This representation of the solution was first stated by Geiringer [11]. In
particular, also the discrete-time dynamics takes place on the simplex $\mathrm{conv}\{R_G(p_0) \mid G \subseteq L\}$.
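We are particularly interested in whether a discrete-time equivalent to Corollary 1 exists, that
is, whether all recombinators $R_G$ commute with $\Phi$. This is of importance since it would allow for
a diagonalisation of the dynamics via the LDE operators (10). To see this, assume for a moment
that $R_\alpha \circ \Phi = \Phi \circ R_\alpha$ for all $\alpha \in L$, and thus $R_G \circ \Phi = \Phi \circ R_G$ for all $G \subseteq L$. Noting that, when
$\alpha \in G \subseteq H$, Eq. (6) from Proposition 1 implies that $(R_\alpha - 1) R_H = R_{H \cup \{\alpha\}} - R_H = R_H - R_H = 0$,
we see that the assumption above would lead to

$$
\begin{aligned}
T_G \circ \Phi &= \sum_{H \supseteq G} (-1)^{|H - G|} R_H \circ \Phi = \sum_{H \supseteq G} (-1)^{|H - G|} \Phi \circ R_H \\
&= \sum_{H \supseteq G} (-1)^{|H - G|} R_H + \sum_{H \supseteq G} (-1)^{|H - G|} \sum_{\alpha \in L} \rho_\alpha (R_\alpha - 1) R_H \\
&= T_G + \sum_{H \supseteq G} (-1)^{|H - G|} \sum_{\alpha \in L \setminus G} \rho_\alpha (R_\alpha - 1) R_H \\
&= \Bigl(1 - \sum_{\alpha \in L \setminus G} \rho_\alpha\Bigr) T_G + \sum_{\alpha \in L \setminus G} \rho_\alpha \sum_{\substack{H \supseteq G \\ \alpha \notin H}} \Bigl( (-1)^{|H - G|} R_{H \cup \{\alpha\}} + (-1)^{|H \cup \{\alpha\} - G|} R_{H \cup \{\alpha\}} \Bigr) \\
&= \Bigl(1 - \sum_{\alpha \in L \setminus G} \rho_\alpha\Bigr) T_G ,
\end{aligned}
$$

so that, indeed, all $T_G(p_t)$ would decay geometrically. This wishful calculation is badly smashed by
the nonlinear nature of the recombinators, and the remainder of this paper is concerned with true
identities that repair the damage.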

To get an intuition for the dynamics in discrete time, let us first take a closer look at the discrete- 
time model with two, three, and four sites. 



4.1 Two and three sites 

For two sites, one simply has $S = \{0, 1\}$ and $L = \{\tfrac12\}$, so that only one non-trivial recombinator
exists, $R = R_{\frac12}$, with corresponding recombination probability $\rho = \rho_{\frac12}$. Consequently, the SCR
equation simplifies to

$$
p_{t+1} = \Phi(p_t) = \rho R(p_t) + (1 - \rho)\, p_t , \tag{16}
$$

where $p_t$ is a $|X|$-dimensional probability vector. The solution is given by

$$
p_t = a(t)\, p_0 + (1 - a(t))\, R(p_0) \tag{17}
$$

with $a(t) = a_\varnothing(t) = (1 - \rho)^t$. This formula is easily verified by induction [18]. Thus, in analogy with
the SCR dynamics in continuous time, the solution is available in closed form, and the coefficient
functions allow an analogous probabilistic interpretation. Furthermore, it is easily seen that the
recombinators $R_\varnothing = 1$ and $R_{\frac12} = R$ commute with $\Phi$ and therefore with $\Phi^t$ for all $t \in \mathbb{N}_0$. For two
sites, the analogue of Corollary 1 thus holds in discrete time. As a consequence, the LDE operators
from (10) decouple and linearise the system (16). At the level of the component LDEs, this is common
knowledge in theoretical population genetics; compare [li Chap.3]. 
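
The two-site statements can be checked directly by iteration. The sketch below is illustrative: the initial frequencies and the value of `rho` are made up, and the function names are hypothetical. It compares the iterated map (16) with the closed form (17).

```python
import numpy as np

def recombine(p):
    """R for two sites: product of the two single-site marginals."""
    return np.outer(p.sum(axis=1), p.sum(axis=0))

def iterate(p0, rho, t):
    """Apply the map of Eq. (16) t times."""
    p = p0.copy()
    for _ in range(t):
        p = rho * recombine(p) + (1.0 - rho) * p
    return p

def closed_form(p0, rho, t):
    """Eq. (17) with a(t) = (1 - rho)^t."""
    a = (1.0 - rho) ** t
    return a * p0 + (1.0 - a) * recombine(p0)

p0 = np.array([[0.5, 0.1],
               [0.1, 0.3]])   # hypothetical initial type frequencies
rho = 0.2                     # hypothetical crossover probability
```

For these inputs, `iterate(p0, rho, t)` and `closed_form(p0, rho, t)` agree for every `t`. The reason is that one step of (16) leaves both single-site marginals unchanged, so `recombine` applied to any iterate returns `recombine(p0)`.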

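Similarly, the recombination equation (1) for three sites can be solved explicitly as well. An elementary
calculation (applying the iteration and comparing coefficients) shows that the corresponding
coefficient functions $a_G(t)$ follow the linear recursion

$$
\begin{pmatrix} a_\varnothing(t+1) \\ a_{\frac12}(t+1) \\ a_{\frac32}(t+1) \\ a_{\{\frac12, \frac32\}}(t+1) \end{pmatrix}
=
\begin{pmatrix}
1 - \rho_{\frac12} - \rho_{\frac32} & 0 & 0 & 0 \\
\rho_{\frac12} & 1 - \rho_{\frac32} & 0 & 0 \\
\rho_{\frac32} & 0 & 1 - \rho_{\frac12} & 0 \\
0 & \rho_{\frac32} & \rho_{\frac12} & 1
\end{pmatrix}
\begin{pmatrix} a_\varnothing(t) \\ a_{\frac12}(t) \\ a_{\frac32}(t) \\ a_{\{\frac12, \frac32\}}(t) \end{pmatrix}
$$

with solution

$$
\begin{aligned}
a_\varnothing(t) &= (1 - \rho_{\frac12} - \rho_{\frac32})^t , \\
a_{\frac12}(t) &= (1 - \rho_{\frac32})^t - (1 - \rho_{\frac12} - \rho_{\frac32})^t , \\
a_{\frac32}(t) &= (1 - \rho_{\frac12})^t - (1 - \rho_{\frac12} - \rho_{\frac32})^t , \\
a_{\{\frac12, \frac32\}}(t) &= 1 - (1 - \rho_{\frac12})^t - (1 - \rho_{\frac32})^t + (1 - \rho_{\frac12} - \rho_{\frac32})^t .
\end{aligned} \tag{18}
$$

These coefficient functions have the same probabilistic interpretation as the corresponding $a_G(t)$,
$G \subseteq L$, in the continuous-time model, so that $a_G(t)$ is the probability that the links that have been
involved in recombination until time $t$ are exactly those of the set $G$.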
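
The three-site recursion and its solution (18) can be compared numerically. In the sketch below, `r1` and `r3` stand for $\rho_{\frac12}$ and $\rho_{\frac32}$; their values and the function names are assumptions made for illustration.

```python
import numpy as np

def transition_matrix(r1, r3):
    """Matrix of the linear recursion; coefficient order is
    (a_empty, a_{1/2}, a_{3/2}, a_{1/2,3/2})."""
    return np.array([
        [1 - r1 - r3, 0.0,    0.0,    0.0],
        [r1,          1 - r3, 0.0,    0.0],
        [r3,          0.0,    1 - r1, 0.0],
        [0.0,         r3,     r1,     1.0],
    ])

def closed_form(r1, r3, t):
    """The four coefficient functions of Eq. (18) at time t."""
    e = (1 - r1 - r3) ** t
    return np.array([e,
                     (1 - r3) ** t - e,
                     (1 - r1) ** t - e,
                     1 - (1 - r1) ** t - (1 - r3) ** t + e])

r1, r3 = 0.15, 0.25                    # hypothetical rho_{1/2}, rho_{3/2}
a = np.array([1.0, 0.0, 0.0, 0.0])     # a_G(0) = 1 for G empty, 0 otherwise
M = transition_matrix(r1, r3)
history = [a]
for _ in range(10):
    a = M @ a
    history.append(a)

# a_empty * a_{1/2,3/2} - a_{1/2} * a_{3/2} would vanish for independent links.
product_gap = a[0] * a[3] - a[1] * a[2]
```

Every entry of `history` matches `closed_form` at the same time step, and the coefficients always sum to one. The quantity `product_gap` is non-zero whenever both crossover probabilities are positive, which is a small numerical trace of the dependence between links in discrete time.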

But there is a crucial difference. Recall that, in continuous time, single crossovers imply independence
of links, which is expressed in the product structure of the coefficient functions $a_G(t)$ (see (9)).
This independence is lost in discrete time, where a crossover event at one link forbids any other cut
at other links in the same time step. Consequently, already for three sites, the coefficients of the
discrete-time dynamics fail to show the product structure used in Theorem 2.

But even though Corollary 1, concerning the forward flow of (7), is a consequence of Theorem 2,
which, in turn, is based upon the product structure of the coefficients, a short calculation reveals
that $R_G \circ \Phi = \Phi \circ R_G$ still holds for the discrete-time model with three sites for all $G \subseteq \{\tfrac12, \tfrac32\}$. As
a consequence, just as in the case of two sites, the $T_G$ linearise and decouple the dynamics, which is
well-known to the experts, see Q for more. 

To summarise: despite the loss of independence of links, an explicit solution of the discrete-time
recombination dynamics is still available, and a linearisation and diagonalisation of the dynamics can
be achieved with the methods developed for the continuous-time model, that is, a transformation
to a solvable system via the $T_G$. However, things will become more complex if we go to four sites
and beyond. In particular, there is no equivalent to Corollary 1, i.e., in general, the recombinators
do not commute with $\Phi$, and we have to search for a new transformation that replaces (10), as will
be explained next.

4.2 Four sites 

The complication with four sites originates from the fact that $R_{\frac32} \circ \Phi \ne \Phi \circ R_{\frac32}$, so that the property
described by Corollary 1 for continuous time is lost here. Consequently, the $T_G$ fail the desired
properties. In particular, one finds

$$
T_\varnothing(\Phi(p)) = (1 - \rho_{\frac12} - \rho_{\frac32} - \rho_{\frac52})\, T_\varnothing(p) - \rho_{\frac12} \rho_{\frac52}\, T_{\frac32}(p),
$$

so that an explicit solution of the model cannot be obtained as before.
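
The four-site identity above lends itself to a numerical check. The sketch below is only an illustration: it assumes binary sites, made-up crossover probabilities, and an initial distribution with strong linkage; the integer $i$ stands for the link $i + \tfrac12$, and all function names are hypothetical.

```python
import numpy as np
from itertools import combinations

LINKS = (0, 1, 2)   # integer i stands for the link i + 1/2

def recombinator(p, i):
    """R_alpha for alpha = i + 1/2: product of leading and trailing marginals."""
    lead = p.sum(axis=tuple(range(i + 1, p.ndim)))
    trail = p.sum(axis=tuple(range(i + 1)))
    return np.multiply.outer(lead, trail)

def R(G, p):
    """Composite recombinator R_G (the order is irrelevant, they commute)."""
    for i in G:
        p = recombinator(p, i)
    return p

def T(G, p):
    """LDE operator T_G of Eq. (10)."""
    rest = [i for i in LINKS if i not in G]
    return sum((-1) ** r * R(set(G) | set(extra), p)
               for r in range(len(rest) + 1)
               for extra in combinations(rest, r))

def phi(p, rho):
    """One step of the discrete-time dynamics, Eq. (14)."""
    return p + sum(rho[i] * (recombinator(p, i) - p) for i in LINKS)

rho = (0.2, 0.3, 0.1)                   # hypothetical crossover probabilities
p = np.full((2, 2, 2, 2), 0.2 / 16)     # 20% uniform background ...
p[0, 0, 0, 0] += 0.4                    # ... plus two strongly linked types
p[1, 1, 1, 1] += 0.4

lhs = T((), phi(p, rho))
rhs = (1 - sum(rho)) * T((), p) - rho[0] * rho[2] * T((1,), p)
commutator_mid = recombinator(phi(p, rho), 1) - phi(recombinator(p, 1), rho)
commutator_end = recombinator(phi(p, rho), 0) - phi(recombinator(p, 0), rho)
```

For this input, `lhs` and `rhs` coincide, `commutator_end` vanishes, and `commutator_mid` does not, so only the recombinator at the middle link fails to commute with `phi`.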

This raises the question why four sites are more difficult than three sites, even though independence
of links has already been lost with three sites. To answer this, we look at the time evolution
of the coefficient functions $a_G(t)$, $G \subseteq L$. For this purpose, let us return to the general model with
an arbitrary number of sites.

4.3 General case 

We now consider an arbitrary (but finite) set $S$ with the corresponding link set $L$. For each $G \subseteq L$,
we use the following abbreviations:

$$
\begin{aligned}
G_{<\alpha} &:= \{i \in G \mid i < \alpha\}, & G_{>\alpha} &:= \{i \in G \mid i > \alpha\}, \\
L_{\le\alpha} &:= \{i \in L \mid i \le \alpha\}, & L_{\ge\alpha} &:= \{i \in L \mid i \ge \alpha\}.
\end{aligned}
$$

Furthermore, we set $\eta := 1 - \sum_{\alpha \in L} \rho_\alpha$. We then obtain

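Theorem 3 For all $G \subseteq L$ and $t \in \mathbb{N}_0$, the coefficient functions $a_G(t)$ evolve according to

$$
a_G(t+1) = \eta\, a_G(t) + \sum_{\alpha \in G} \rho_\alpha \Bigl( \sum_{H \subseteq L_{\ge\alpha}} a_{G_{<\alpha} \cup H}(t) \Bigr) \Bigl( \sum_{K \subseteq L_{\le\alpha}} a_{K \cup G_{>\alpha}}(t) \Bigr), \tag{19}
$$

with initial condition $a_G(0) = \delta_{G, \varnothing}$.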

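Proof Geiringer [11] already explained in words how to derive this general recursion, and illustrated it
with the four-site example; we give a proof via our operator formalism. Using (15), the recombination
equation for $p_{t+1}$ reads

$$
\begin{aligned}
p_{t+1} &= \sum_{G \subseteq L} a_G(t+1) R_G(p_0) = \Phi(p_t) = \Phi\Bigl( \sum_{G \subseteq L} a_G(t) R_G(p_0) \Bigr) \\
&= \sum_{\alpha \in L} \rho_\alpha \Bigl( \pi_{<\alpha}.\Bigl( \sum_{H \subseteq L} a_H(t) R_H(p_0) \Bigr) \Bigr) \otimes \Bigl( \pi_{>\alpha}.\Bigl( \sum_{K \subseteq L} a_K(t) R_K(p_0) \Bigr) \Bigr) + \eta \sum_{G \subseteq L} a_G(t) R_G(p_0),
\end{aligned}
$$

where each product term in the first sum can be calculated as

$$
\begin{aligned}
\Bigl( \pi_{<\alpha}.\Bigl( \sum_{H \subseteq L} a_H(t) R_H(p_0) \Bigr) \Bigr) &\otimes \Bigl( \pi_{>\alpha}.\Bigl( \sum_{K \subseteq L} a_K(t) R_K(p_0) \Bigr) \Bigr) \\
&= \sum_{H, K \subseteq L} a_H(t)\, a_K(t) \bigl( \pi_{<\alpha}.R_H(p_0) \bigr) \otimes \bigl( \pi_{>\alpha}.R_K(p_0) \bigr) \\
&= \sum_{H, K \subseteq L} a_H(t)\, a_K(t) \bigl( \pi_{<\alpha}.R_{H_{<\alpha} \cup K_{>\alpha}}(p_0) \bigr) \otimes \bigl( \pi_{>\alpha}.R_{H_{<\alpha} \cup K_{>\alpha}}(p_0) \bigr) \\
&= \sum_{H, K \subseteq L} a_H(t)\, a_K(t)\, R_\alpha \bigl( R_{H_{<\alpha} \cup K_{>\alpha}}(p_0) \bigr),
\end{aligned}
$$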

where we use the linearity of the projectors in the first step, and Eqs. (4) and (5) from Proposition 1
in the second (more precisely, we use the left parts of Eqs. (4) and (5), reading them both forward
and backward). Insert this into the expression for $p_{t+1}$ and rearrange the sums for a comparison of
coefficients of $R_G$ with $G \subseteq L$. Comparison of coefficients is justified by the observation that, for
generic $p_0$ and generic site spaces, the vectors $R_G(p_0)$ with $G \subseteq L$ are the extremal vectors of the
closed simplex $\mathrm{conv}\{R_K(p_0) \mid K \subseteq L\}$. They are the vectors that (generically) cannot be expressed
as non-trivial convex combination within the simplex, and hence the vertices of the latter (in cases
with degeneracies, one reduces the simplex in the obvious way). If $G = \varnothing$, we only have $\eta\, a_\varnothing(t)$ as
coefficient for $R_\varnothing$. Otherwise, we get additional contributions for each $\alpha \in G$, namely, from those
$H, K \subseteq L$ for which $H_{<\alpha} = G_{<\alpha}$ and $K_{>\alpha} = G_{>\alpha}$, while $H_{\ge\alpha}$ and $K_{\le\alpha}$ can be any subset of $L_{\ge\alpha}$
and $L_{\le\alpha}$, respectively. Hence, the term belonging to $R_G(p_0)$ reads

$$
\eta\, a_G(t) + \sum_{\alpha \in G} \rho_\alpha \sum_{H \subseteq L_{\ge\alpha}} \sum_{K \subseteq L_{\le\alpha}} a_{G_{<\alpha} \cup H}(t)\, a_{K \cup G_{>\alpha}}(t),
$$

and the assertion follows. □
The $a_G(t)$ have the same probabilistic interpretation as the $a_G(t)$ from the continuous-time model,
and the above iteration can be understood intuitively as well: A type $x$ resulting from recombination
at link $\alpha$ is composed of two segments $x_{<\alpha}$ and $x_{>\alpha}$. These segments themselves may have been
pieced together in previous recombination events already, and the iteration explains the possible cuts
these segments may carry along. The first term in the product stands for the type delivering the
leading segment (which may bring along arbitrary cuts in the trailing segment), the second for the
type delivering the trailing one (here any leading segment is allowed). The term $\eta\, a_G(t)$ covers the
case of no recombination.

Note that the above iteration is generally nonlinear, where the products stem from the fact that 
types recombine independently. This nonlinearity is the reason that an explicit solution cannot be 
given as before. 

A notable exception is provided by recombination events that occur at links where one of the
involved segments cannot have been affected by previous crossovers, namely the links $\tfrac12$ and $\tfrac{2n-1}{2}$.
In this case, at least one of the factors in Eq. (19) becomes 1 (since, obviously, $G_{<\alpha} = \varnothing$ for $\alpha = \tfrac12$
and $G_{>\alpha} = \varnothing$ for $\alpha = \tfrac{2n-1}{2}$) and the resulting linear and triangular recursion can be solved. The
coefficients for the corresponding link sets can be inferred directly (proof via simple induction) as

$$
\begin{aligned}
a_\varnothing(t) &= \eta^t , \\
a_{\frac12}(t) &= (\eta + \rho_{\frac12})^t - \eta^t , \\
a_{\frac{2n-1}{2}}(t) &= (\eta + \rho_{\frac{2n-1}{2}})^t - \eta^t , \quad \text{and} \\
a_{\{\frac12, \frac{2n-1}{2}\}}(t) &= \eta^t - (\eta + \rho_{\frac12})^t - (\eta + \rho_{\frac{2n-1}{2}})^t + (\eta + \rho_{\frac12} + \rho_{\frac{2n-1}{2}})^t .
\end{aligned} \tag{20}
$$

This explains the availability of an explicit solution for the model with up to three sites, where we
do not have links other than $\tfrac12$ and/or $\tfrac32$, so that all corresponding coefficients can be determined
explicitly. Indeed, one recovers (18) with $n = 2$ and $\eta = 1 - \rho_{\frac12} - \rho_{\frac32}$.
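
The recursion (19) is straightforward to iterate numerically, which also allows a check of the explicit formulas (20). The sketch below is a minimal illustration under assumptions made here: links are addressed by integers $i$ standing for $i + \tfrac12$, the crossover probabilities are made up, and the names `subsets`, `step` and `coefficients` are hypothetical.

```python
from itertools import combinations

def subsets(links):
    """All subsets of an iterable of links, as frozensets."""
    links = sorted(links)
    return [frozenset(c) for r in range(len(links) + 1)
            for c in combinations(links, r)]

def step(a, rho):
    """One application of recursion (19); integer i stands for link i + 1/2."""
    links = range(len(rho))
    eta = 1.0 - sum(rho)
    new = {}
    for G in a:
        total = eta * a[G]                      # no crossover in this step
        for alpha in G:
            g_lt = frozenset(i for i in G if i < alpha)
            g_gt = frozenset(i for i in G if i > alpha)
            lead = sum(a[g_lt | H]
                       for H in subsets(i for i in links if i >= alpha))
            trail = sum(a[K | g_gt]
                        for K in subsets(i for i in links if i <= alpha))
            total += rho[alpha] * lead * trail
        new[G] = total
    return new

def coefficients(rho, t):
    """a_G(t) for all G, starting from a_G(0) = 1 for G empty, 0 otherwise."""
    a = {G: float(not G) for G in subsets(range(len(rho)))}
    for _ in range(t):
        a = step(a, rho)
    return a

rho = [0.1, 0.2, 0.15]          # hypothetical probabilities for four sites
t = 6
a = coefficients(rho, t)
eta = 1.0 - sum(rho)
first, last = 0, len(rho) - 1   # the two outermost links
```

For four sites, the coefficients in `a` sum to one, and the entries for the empty set, for each outer link alone, and for the pair of outer links reproduce (20). The remaining entries, which involve the middle link, are obtained only through the nonlinear recursion.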

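So far, we have observed that the product structure of the coefficient functions, known from
continuous time, is lost in discrete time from three sites onwards; this reflects the dependence of
links. In contrast, the linearity of the iteration is only lost from four sites onwards. The latter can be
understood further by comparison of (19) with the differential equations for the coefficients of the
continuous-time model. These read:

$$
\frac{d}{dt} a_G(t) = -\Bigl( \sum_{\alpha \in L \setminus G} \rho_\alpha \Bigr) a_G(t) + \sum_{\alpha \in G} \rho_\alpha\, a_{G \setminus \{\alpha\}}(t), \tag{21}
$$

that is, they are linear, with solution (9). Note that this linear dynamics emerges from a seemingly
nonlinear one, namely the analogue of (19),

$$
\frac{d}{dt} a_G(t) = -\Bigl( \sum_{\alpha \in L} \rho_\alpha \Bigr) a_G(t) + \sum_{\alpha \in G} \rho_\alpha \Bigl( \sum_{H \subseteq L_{\ge\alpha}} a_{G_{<\alpha} \cup H}(t) \Bigr) \Bigl( \sum_{K \subseteq L_{\le\alpha}} a_{K \cup G_{>\alpha}}(t) \Bigr). \tag{22}
$$

However, due to the product structure of the solution, the product term in the second sum, when
inserting (9), reduces to a single term,

$$
\Bigl( \sum_{H \subseteq L_{\ge\alpha}} a_{G_{<\alpha} \cup H}(t) \Bigr) \Bigl( \sum_{K \subseteq L_{\le\alpha}} a_{K \cup G_{>\alpha}}(t) \Bigr) = a_G(t) + a_{G \setminus \{\alpha\}}(t),
$$

which turns (22) into (21).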
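
The reduction of the product term can be confirmed numerically for the continuous-time coefficients (9). The following sketch assumes made-up rates for five sites and again addresses the link $i + \tfrac12$ by the integer $i$; all names are hypothetical.

```python
from itertools import combinations
from math import exp, prod

def subsets(links):
    """All subsets of an iterable of links, as frozensets."""
    links = sorted(links)
    return [frozenset(c) for r in range(len(links) + 1)
            for c in combinations(links, r)]

def a_coeff(G, rho, t):
    """Continuous-time a_G(t) of Eq. (9); integer i stands for link i + 1/2."""
    return prod((1.0 - exp(-r * t)) if i in G else exp(-r * t)
                for i, r in enumerate(rho))

def product_term(G, alpha, rho, t):
    """The product of the two marginal sums appearing in Eq. (22)."""
    links = range(len(rho))
    g_lt = frozenset(i for i in G if i < alpha)
    g_gt = frozenset(i for i in G if i > alpha)
    lead = sum(a_coeff(g_lt | H, rho, t)
               for H in subsets(i for i in links if i >= alpha))
    trail = sum(a_coeff(K | g_gt, rho, t)
                for K in subsets(i for i in links if i <= alpha))
    return lead * trail

rho = [0.3, 0.1, 0.2, 0.4]     # hypothetical crossover rates, five sites
t = 0.8
max_gap = max(
    abs(product_term(G, alpha, rho, t)
        - a_coeff(G, rho, t) - a_coeff(G - {alpha}, rho, t))
    for G in subsets(range(len(rho))) for alpha in G)
```

Here `max_gap` vanishes up to rounding. The two sums are the marginal probabilities of the cut patterns before and after $\alpha$, and by independence of links their product is the probability of that pattern irrespective of the state of $\alpha$ itself.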

What happens here is the following. From four sites onwards (namely, beginning with $n = 3$ and
a crossover at $\alpha = \tfrac32$, and both in discrete and continuous time), it happens that leading and trailing
segments meet that both possess at least one link that may possibly have seen a previous cut. When
a crossover at $\alpha$ takes place, the new joint distribution of cuts before and after $\alpha$ is formed as the
product measure of the marginal distributions of cuts in the leading and trailing segments (cf. (19)
and (22)), akin to the formation of product measures of marginal types by $R_\alpha$. In continuous
time, the links are all independent, hence the new combination leaves the joint distribution of cuts
unchanged. Therefore, a set $G$ of affected links (before and after $\alpha$) is simply augmented by $\alpha$ if $\alpha$ is
a 'fresh' cut; this results in the linearity of (21). In discrete time, however, the dependence between
the links, in particular between those in the leading and trailing segment, means that the formation
of the product measure changes the joint distribution of affected links, in addition to the new cut at
$\alpha$; thus (19) remains nonlinear.

Since we aim at an explicit solution of the discrete-time recombination model, we need to find a 
way to overcome the obstacles of nonlinearity. Inspired by the results of the continuous-time model, 
we now search for a transformation that decouples and linearises the dynamics. 

To this end, we first investigate the behaviour of the $R_G$ and $T_G$ in the discrete-time model, since
a deeper understanding of their actions will help us find a new transformation. We are still concerned
with the LDE operators from the continuous-time model, because of their favourable structure and
the existence of the inverse transformation (Möbius inversion). Moreover, as will become clear later,
some of them still have the desired features and can be adopted directly for the discrete-time model. 
First, we need further notation. 

Definition 1 Two links $\alpha,\beta \in L$ are called adjacent if $|\alpha - \beta| = 1$. We say that a subset $L' \subseteq L$ is
contiguous if for any two links $\alpha, \beta \in L'$ with $\alpha < \beta$, also all links between $\alpha$ and $\beta$ belong to $L'$ (this
includes the case $L' = \varnothing$). A non-empty contiguous set of links is written as $L' = \{\ell_{\min}, \dots, \ell_{\max}\}$.
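Definition 1 can be illustrated with a minimal Python sketch. It assumes that links are encoded as half-integers via `fractions.Fraction`; the helper names `links`, `adjacent` and `contiguous` are hypothetical and serve only this illustration.

```python
from fractions import Fraction

def links(n):
    """Links of a system with sites 0..n: 1/2, 3/2, ..., (2n-1)/2."""
    return [Fraction(2 * i + 1, 2) for i in range(n)]

def adjacent(a, b):
    """Two links are adjacent if they differ by exactly 1."""
    return abs(a - b) == 1

def contiguous(subset):
    """A set of links is contiguous if consecutive elements are adjacent.

    The empty set and singletons are contiguous by convention.
    """
    s = sorted(subset)
    return all(adjacent(a, b) for a, b in zip(s, s[1:]))
```

For instance, with four sites the set consisting of the links 1/2 and 5/2 is not contiguous, because the link 3/2 between them is missing.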

Whereas, according to Theorem 2, all recombinators act linearly on the solution of the continuous-time
recombination equation, this does not hold for the solution of the discrete-time model in general,
though the following property still holds.

Lemma 1 Let $\{c_G \mid G \subseteq L\}$ be a family of non-negative numbers with $\sum_{G\subseteq L} c_G = 1$. For an arbitrary
$\nu \in \mathcal{P}(X)$ and for all $K \subseteq L$ with $L\setminus K$ contiguous, one has

$$R_K\Big(\sum_{G\subseteq L} c_G\, R_G(\nu)\Big) = \sum_{G\subseteq L} c_G\, R_{K\cup G}(\nu).$$

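Proof When $K = \varnothing$, the claim is clear, because $R_\varnothing = \mathbb{1}$ and $L$ itself is contiguous. Otherwise, we
have $K = A \cup B$ with $A := \{\tfrac12, \tfrac32, \dots, \alpha\}$ and $B := \{\beta, \beta+1, \dots, \tfrac{2n-1}{2}\}$ for some $\beta > \alpha$ (this includes
the case $K = L$ via $\beta = \alpha + 1$). Since we work on $\mathcal{P}(X)$, we have $R_K = R_B R_A$ from Proposition 1.
With the projection $\pi_i.\colon \mathcal{P}(X) \to \mathcal{P}(X_i)$ onto a single site $i$, we obtain

$$\pi_i.\Big(\sum_{G\subseteq L} c_G R_G(\nu)\Big) = \sum_{G\subseteq L} c_G\, \pi_i.R_G(\nu) = \sum_{G\subseteq L} c_G\, \pi_i.\nu = \pi_i.\nu, \qquad (23)$$

since $\pi_i.$ is a linear operator and $\pi_i.R_G(\nu) = \pi_i.\nu$ by Proposition 1. For the contiguous set $A$ and
$w := \sum_{G\subseteq L} c_G R_G(\nu)$, we obtain, with the help of (23) and a repeated application of Proposition 1,

$$R_A\Big(\sum_{G\subseteq L} c_G R_G(\nu)\Big) = \pi_0.w \otimes \cdots \otimes \pi_{\lfloor\alpha\rfloor}.w \otimes \pi_{>\alpha}.w$$
$$= \pi_0.\nu \otimes \cdots \otimes \pi_{\lfloor\alpha\rfloor}.\nu \otimes \pi_{>\alpha}.w = \sum_{G\subseteq L} c_G \big(\pi_0.\nu \otimes \cdots \otimes \pi_{\lfloor\alpha\rfloor}.\nu \otimes \pi_{>\alpha}.R_G(\nu)\big)$$
$$= \sum_{G\subseteq L} c_G \big(\pi_0.R_G(\nu) \otimes \cdots \otimes \pi_{\lfloor\alpha\rfloor}.R_G(\nu) \otimes \pi_{>\alpha}.R_G(\nu)\big) = \sum_{G\subseteq L} c_G\, R_{A\cup G}(\nu).$$

An analogous calculation reveals $R_B\big(\sum_{G\subseteq L} c_G R_{A\cup G}(\nu)\big) = \sum_{G\subseteq L} c_G R_{A\cup B\cup G}(\nu)$ for contiguous $B$.
This proves the assertion. □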
The intuitive content of Lemma 1 falls into place with the explanation of Theorem 3. The linearity
of the particular recombinators of Lemma 1 is due to the fact that $R_K$ produces only one segment,
namely $L \setminus K$, that might be affected by previous recombination events while all other segments
consist of only one site and thus cannot bring along cuts from 'the past'.



5 Reduction to subsystems 

In this section, we show a certain product structure of the recombinators and the LDE operators.
This will turn out as the key for constructing an appropriate transformation. Recall that a crossover
at link $\alpha \in L$ partitions $S$ into $\{0, \dots, \lfloor\alpha\rfloor\}$ and $\{\lceil\alpha\rceil, \dots, n\}$. In general, recombination at the links
belonging to $G = \{\alpha_1, \dots, \alpha_{|G|}\} \subseteq L$, $\alpha_1 < \alpha_2 < \cdots < \alpha_{|G|}$, induces the following ordered
partition $\mathcal{S}_G = \{J_0^G, J_1^G, \dots, J_{|G|}^G\}$ of $S$ (see Fig. 1):

$$J_0^G = \{0, \dots, \lfloor\alpha_1\rfloor\},\quad J_1^G = \{\lceil\alpha_1\rceil, \dots, \lfloor\alpha_2\rfloor\},\quad \dots,\quad J_{|G|}^G = \{\lceil\alpha_{|G|}\rceil, \dots, n\}.$$

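Note that the partition is ordered due to the restriction to single crossovers. In connection with this,
we have the sets of links that correspond to the respective parts of the partition $\mathcal{S}_G$ (Fig. 1). Namely,
for $\varnothing \ne G \subseteq L$, $\mathcal{L}_G := \{I_0^G, I_1^G, \dots, I_{|G|}^G\}$ with

$$I_0^G = \{\alpha \in L : \tfrac12 \le \alpha < \alpha_1\},\qquad I_{|G|}^G = \{\alpha \in L : \alpha_{|G|} < \alpha \le \tfrac{2n-1}{2}\},$$
$$\text{and}\quad I_i^G = \{\alpha \in L : \alpha_i < \alpha < \alpha_{i+1}\}\quad\text{for } 1 \le i \le |G| - 1 \qquad (24)$$

specifies the links belonging to the respective parts of $\mathcal{S}_G$: the links associated with $J_k^G \in \mathcal{S}_G$,
$0 \le k \le |G|$, are exactly those of $I_k^G \in \mathcal{L}_G$ (and vice versa).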



Fig. 1 A system with 10 sites (i.e., $S = \{0, \dots, 9\}$, $L = \{\tfrac12, \dots, \tfrac{17}{2}\}$) cut at the links $G = \{\tfrac52, \tfrac{13}{2}, \tfrac{15}{2}\}$ (broken
lines). The resulting subsystems are $\mathcal{S}_G = \{J_0, \dots, J_3\}$ and $\mathcal{L}_G = \{I_0, \dots, I_3\}$ with $J_0 = \{0, 1, 2\}$, $J_1 = \{3, 4, 5, 6\}$,
$J_2 = \{7\}$ and $J_3 = \{8, 9\}$ as well as $I_0 = \{\tfrac12, \tfrac32\}$, $I_1 = \{\tfrac72, \tfrac92, \tfrac{11}{2}\}$, $I_2 = \varnothing$ and $I_3 = \{\tfrac{17}{2}\}$ (the upper index $G$ is
suppressed here for clarity).

With this definition, $I_i^G = \varnothing$ is possible for each $0 \le i \le |G|$ and will be included (possibly
multiply) in $\mathcal{L}_G$. Furthermore, $\mathcal{L}_\varnothing := \{L\}$, so that $I_0^\varnothing = L$. The upper index will be suppressed in
cases where the corresponding set of links is obvious. Clearly, $\mathcal{L}_G$ is not a partition of $L$, whereas
$\mathcal{S}_G$ is a partition of $S$.
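The construction of the site blocks and link blocks can be sketched in a few lines of Python. This is only an illustration under the assumption that links are encoded as half-integer `Fraction` objects; the function name `subsystems` is hypothetical. The example reproduces the ten-site system of Fig. 1.

```python
from fractions import Fraction
from math import ceil, floor

def subsystems(n, G):
    """Return (J, I): site blocks and link blocks induced by cutting at the links in G.

    Sites are 0..n, links are the half-integers 1/2, ..., (2n-1)/2.
    """
    all_links = [Fraction(2 * i + 1, 2) for i in range(n)]
    cuts = sorted(G)
    # Each block of sites starts right after a cut and ends right before the next one.
    starts = [0] + [ceil(a) for a in cuts]
    ends = [floor(a) for a in cuts] + [n]
    J = [list(range(s, e + 1)) for s, e in zip(starts, ends)]
    # The links of a block lie strictly between its first and its last site.
    I = [[a for a in all_links if s < a < e] for s, e in zip(starts, ends)]
    return J, I

# The example of Fig. 1: ten sites, cut at the links 5/2, 13/2 and 15/2.
J, I = subsystems(9, [Fraction(5, 2), Fraction(13, 2), Fraction(15, 2)])
```

Note that the block of links belonging to the single-site block `[7]` is empty, in line with the remark that empty link sets may occur in the family of link blocks.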

This way, recombination at the links in $G \subseteq L$ produces several 'subsystems' (characterised
through the sites $J_k^G$ and the corresponding links $I_k^G$, $0 \le k \le |G|$) with respect to the 'full system'
described through the sites $S$ and the links $L$. We demonstrate below that it is sufficient to consider
these subsystems separately, a property that reduces the problem of dealing with the recombination
dynamics. Note first that repeated application of (HI and © leads to 

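$$\pi_{<\alpha}.R_G^{(L)}(p) = R_{G_{<\alpha}}^{(L_{<\alpha})}(\pi_{<\alpha}.p) \quad\text{and}\quad \pi_{>\alpha}.R_G^{(L)}(p) = R_{G_{>\alpha}}^{(L_{>\alpha})}(\pi_{>\alpha}.p), \qquad (25)$$

where $R_G^{(L)}$ is our usual recombinator acting on $\mathcal{P}(X) = \mathcal{P}(X_0 \times \cdots \times X_n)$, and $R_{G_{<\alpha}}^{(L_{<\alpha})}$ denotes
the respective recombinator on $\mathcal{P}(X_0 \times \cdots \times X_{\lfloor\alpha\rfloor})$, which acts on the subsystem specified through
$L_{<\alpha}$ and cuts the links $G_{<\alpha}$ (and analogously for $R_{G_{>\alpha}}^{(L_{>\alpha})}$). Likewise, recombinators $R_H^{(I_i)}$,
$H \subseteq I_i$, acting on $\mathcal{P}(X_{J_i})$, may be defined for all subsystems, $0 \le i \le |G|$, in the obvious way. For
consistency, we define $R_\varnothing^{(\varnothing)} = \mathbb{1}$. From now on, the upper index specifies the corresponding system
the $R_G$ (and, likewise, the $T_G$) are acting on. It will be suppressed in cases where the system is
obvious. We now explain the inherent product structure of the recombinators: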

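Proposition 2 Let $G \subseteq L$. For each $\alpha \in G$ and $p \in \mathcal{P}(X)$, one has the identity

$$R_G^{(L)}(p) = \big(R_{G_{<\alpha}}^{(L_{<\alpha})}(\pi_{<\alpha}.p)\big) \otimes \big(R_{G_{>\alpha}}^{(L_{>\alpha})}(\pi_{>\alpha}.p)\big).$$

Proof For $\alpha \in G$, Proposition 1 implies:

$$R_G^{(L)}(p) = R_\alpha^{(L)}\big(R_G^{(L)}(p)\big) = \big(\pi_{<\alpha}.R_G^{(L)}(p)\big) \otimes \big(\pi_{>\alpha}.R_G^{(L)}(p)\big)$$
$$= \big(R_{G_{<\alpha}}^{(L_{<\alpha})}(\pi_{<\alpha}.p)\big) \otimes \big(R_{G_{>\alpha}}^{(L_{>\alpha})}(\pi_{>\alpha}.p)\big),$$

where the last step follows from (25). □

This proposition carries over to the LDE operators: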
This proposition carries over to the LDE operators: 
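Proposition 3 On $\mathcal{P}(X)$, the LDE operators satisfy

$$T_G^{(L)}(p) = \big(T_{G_{<\alpha}}^{(L_{<\alpha})}(\pi_{<\alpha}.p)\big) \otimes \big(T_{G_{>\alpha}}^{(L_{>\alpha})}(\pi_{>\alpha}.p)\big) \quad\text{for all } \alpha \in G,$$

where $T_{G_{<\alpha}}^{(L_{<\alpha})}$ and $T_{G_{>\alpha}}^{(L_{>\alpha})}$ now describe the operators acting on the simplices $\mathcal{P}(X_0 \times \cdots \times X_{\lfloor\alpha\rfloor})$ and
$\mathcal{P}(X_{\lceil\alpha\rceil} \times \cdots \times X_n)$, respectively.

Proof Let $\alpha \in G$. Using the product structure from Proposition 2 and splitting the sum into two
disjoint parts, one obtains

$$T_G^{(L)}(p) = \sum_{H\supseteq G} (-1)^{|H-G|} R_H^{(L)}(p) = \sum_{H\supseteq G} (-1)^{|H-G|} \big(R_{H_{<\alpha}}^{(L_{<\alpha})}(\pi_{<\alpha}.p)\big) \otimes \big(R_{H_{>\alpha}}^{(L_{>\alpha})}(\pi_{>\alpha}.p)\big)$$
$$= \sum_{L\setminus\{\alpha\}\,\supseteq\, H\,\supseteq\, G\setminus\{\alpha\}} (-1)^{|H-(G\setminus\{\alpha\})|} \big(R_{H_{<\alpha}}^{(L_{<\alpha})}(\pi_{<\alpha}.p)\big) \otimes \big(R_{H_{>\alpha}}^{(L_{>\alpha})}(\pi_{>\alpha}.p)\big)$$
$$= \sum_{H_{<\alpha}\supseteq G_{<\alpha}} (-1)^{|H_{<\alpha}-G_{<\alpha}|} \sum_{H_{>\alpha}\supseteq G_{>\alpha}} (-1)^{|H_{>\alpha}-G_{>\alpha}|} \big(R_{H_{<\alpha}}^{(L_{<\alpha})}(\pi_{<\alpha}.p)\big) \otimes \big(R_{H_{>\alpha}}^{(L_{>\alpha})}(\pi_{>\alpha}.p)\big)$$
$$= \Big(\sum_{H_{<\alpha}\supseteq G_{<\alpha}} (-1)^{|H_{<\alpha}-G_{<\alpha}|} R_{H_{<\alpha}}^{(L_{<\alpha})}(\pi_{<\alpha}.p)\Big) \otimes \Big(\sum_{H_{>\alpha}\supseteq G_{>\alpha}} (-1)^{|H_{>\alpha}-G_{>\alpha}|} R_{H_{>\alpha}}^{(L_{>\alpha})}(\pi_{>\alpha}.p)\Big),$$

which establishes the claim. □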
Using this argument iteratively on the respective segments, one easily obtains

$$T_G^{(L)}(p) = \big(T_\varnothing^{(I_0)}(\pi_{J_0}.p)\big) \otimes \big(T_\varnothing^{(I_1)}(\pi_{J_1}.p)\big) \otimes \cdots \otimes \big(T_\varnothing^{(I_{|G|})}(\pi_{J_{|G|}}.p)\big), \qquad (26)$$

where the upper index specifies the corresponding subsystems associated with $G$, compare (24).
Hence, the effect of $T_G$ on the full system is given by that of $T_\varnothing$ on the respective subsystems
corresponding to $G$.

Our goal is now to study the effect of the $R_G$ and $T_G$ on $\Phi$, the right-hand side of the recombination
equation (15). This will show us in more detail when and why the LDE operators from the
continuous-time model are not sufficient for solving the discrete-time model and, at the same time,
will direct us to the new transformation.

If $\Phi^{(L)}$ denotes the right-hand side of the recombination equation on the full simplex $\mathcal{P}(X_0 \times
\cdots \times X_n)$, then, for any contiguous $I = \{\alpha, \dots, \beta\} \subseteq L$, the right-hand side of the recombination
equation on the subsimplex $\mathcal{P}(X_{\lfloor\alpha\rfloor} \times \cdots \times X_{\lceil\beta\rceil})$ will be denoted with $\Phi^{(I)}$. Again, we suppress the
upper index when the simplex is obvious.

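Proposition 4 For the right-hand side of the recombination equation,
one finds

$$R_\alpha^{(L)}\big(\Phi^{(L)}(p)\big) = \big(\Phi^{(L_{<\alpha})}(\pi_{<\alpha}.p)\big) \otimes \big(\Phi^{(L_{>\alpha})}(\pi_{>\alpha}.p)\big)$$

for every $\alpha \in L$ and $p \in \mathcal{P}(X)$.

Proof Since $R_\alpha^{(L)}\big(\Phi^{(L)}(p)\big) = \big(\pi_{<\alpha}.(\Phi^{(L)}(p))\big) \otimes \big(\pi_{>\alpha}.(\Phi^{(L)}(p))\big)$, we obtain with the help of (25)

$$\pi_{<\alpha}.\big(\Phi^{(L)}(p)\big) = \eta\,\pi_{<\alpha}.p + \sum_{\beta<\alpha} \rho_\beta\, \pi_{<\alpha}.R_\beta^{(L)}(p) + \sum_{\beta\ge\alpha} \rho_\beta\, \pi_{<\alpha}.R_\beta^{(L)}(p)$$
$$= \Big(1 - \sum_{\beta<\alpha}\rho_\beta\Big)\pi_{<\alpha}.p + \sum_{\beta<\alpha} \rho_\beta\, R_\beta^{(L_{<\alpha})}(\pi_{<\alpha}.p) = \Phi^{(L_{<\alpha})}(\pi_{<\alpha}.p).$$

Analogously, one obtains $\pi_{>\alpha}.\big(\Phi^{(L)}(p)\big) = \Phi^{(L_{>\alpha})}(\pi_{>\alpha}.p)$, and the assertion follows. □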
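More generally, this proposition implies inductively that

$$R_G^{(L)}\big(\Phi^{(L)}(p)\big) = \big(\Phi^{(I_0)}(\pi_{J_0}.p)\big) \otimes \big(\Phi^{(I_1)}(\pi_{J_1}.p)\big) \otimes \cdots \otimes \big(\Phi^{(I_{|G|})}(\pi_{J_{|G|}}.p)\big). \qquad (27)$$

Finally, for the interaction between the $T_G^{(L)}$ and $\Phi^{(L)}$, we have the following result.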

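Proposition 5 For the LDE operators (10) and all $G \subseteq L$, one has

$$T_G^{(L)}\big(\Phi^{(L)}(p)\big) = \big(T_\varnothing^{(I_0)}(\Phi^{(I_0)}(\pi_{J_0}.p))\big) \otimes \big(T_\varnothing^{(I_1)}(\Phi^{(I_1)}(\pi_{J_1}.p))\big) \otimes \cdots \otimes \big(T_\varnothing^{(I_{|G|})}(\Phi^{(I_{|G|})}(\pi_{J_{|G|}}.p))\big),$$

with $I_0, \dots, I_{|G|}$ according to (24).

Proof Using (27) and (26), one calculates

$$T_G^{(L)}\big(\Phi^{(L)}(p)\big) = T_G^{(L)}\big(R_G^{(L)}(\Phi^{(L)}(p))\big) = T_G^{(L)}\big((\Phi^{(I_0)}(\pi_{J_0}.p)) \otimes \cdots \otimes (\Phi^{(I_{|G|})}(\pi_{J_{|G|}}.p))\big)$$
$$= \big(T_\varnothing^{(I_0)}(\Phi^{(I_0)}(\pi_{J_0}.p))\big) \otimes \cdots \otimes \big(T_\varnothing^{(I_{|G|})}(\Phi^{(I_{|G|})}(\pi_{J_{|G|}}.p))\big),$$

which establishes the formula. □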
This result is of particular significance since it shows that, to determine the effect of the $T_G$ on $\Phi$,
it is sufficient to know the action of the $T_\varnothing$ on the subsystems that correspond to $G$. Hence, we now
need to determine $T_\varnothing \circ \Phi$. It will turn out that this relies crucially on the commutators of $R_G$ with
$\Phi$, which will be the subject of the next section.

6 The commutator and linearisation 

The more algebraic approach of [2], which was later generalised by Popa [l^, suggests to further 
analyse the problem in terms of commuting versus non-commuting quantities. For $G \subseteq L$, the
commutator is defined as $[R_G, \Phi] := R_G \circ \Phi - \Phi \circ R_G$. Recall that, in the continuous-time model,
the linear action of the recombinators on the solution of the differential equations entails that the
corresponding forward flow commutes with each recombinator (see Corollary 1). But this no longer
holds for discrete time: $[R_G, \Phi] = 0$ is not true in general. We are interested in the commutators
because, as we will see in a moment, they lead us to the evaluation of $T_\varnothing \circ \Phi$, and this in turn
gives $T_G \circ \Phi$ (see Proposition 5).

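Proposition 6 Let $\eta = 1 - \sum_{\alpha\in L} \rho_\alpha$ as before. On $\mathcal{P}(X)$, one has

$$T_\varnothing \circ \Phi = \eta\, T_\varnothing + \sum_{G\subseteq L} (-1)^{|G|}\, [R_G, \Phi].$$

Proof Expressing the left-hand side as

$$T_\varnothing \circ \Phi = \sum_{G\subseteq L} (-1)^{|G|} (R_G \circ \Phi) = \sum_{G\subseteq L} (-1)^{|G|} (\Phi \circ R_G) + \sum_{G\subseteq L} (-1)^{|G|} [R_G, \Phi],$$

and using $\Phi = \eta\,\mathbb{1} + \sum_{\alpha\in L} \rho_\alpha R_\alpha$, one calculates

$$\sum_{G\subseteq L} (-1)^{|G|} (\Phi \circ R_G) = \sum_{G\subseteq L} \Big(\sum_{\alpha\in L} (-1)^{|G|} \rho_\alpha R_\alpha R_G\Big) + \eta \sum_{G\subseteq L} (-1)^{|G|} R_G$$
$$= \eta\, T_\varnothing + \sum_{\alpha\in L} \sum_{\substack{G\subseteq L\\ \alpha\notin G}} \Big((-1)^{|G|} \rho_\alpha R_\alpha R_G + (-1)^{|G\cup\{\alpha\}|} \rho_\alpha R_{G\cup\{\alpha\}}\Big)$$
$$= \eta\, T_\varnothing + \sum_{\alpha\in L} \sum_{\substack{G\subseteq L\\ \alpha\notin G}} (-1)^{|G|} \rho_\alpha \big(R_\alpha R_G - R_{G\cup\{\alpha\}}\big) = \eta\, T_\varnothing,$$

where the last step uses $R_\alpha R_G = R_{G\cup\{\alpha\}}$ on $\mathcal{P}(X)$. This shows the claim. □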

Proposition 6 shows that $T_\varnothing$ only yields a diagonal component if all recombinators commute with
$\Phi$. We now need to determine the commutator $[R_G, \Phi]$. To this end, it is advantageous to introduce
a new set of operators.

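Definition 2 For $G \subseteq K \subseteq L$, we define the operators

$$\widetilde{T}_{G,K} := \sum_{G\subseteq H\subseteq K} (-1)^{|H-G|} R_H. \qquad (28)$$

Equivalently, for any $M \subseteq L \setminus G$, this means that

$$\widetilde{T}_{G,G\cup M} = \sum_{G\subseteq H\subseteq G\cup M} (-1)^{|H-G|} R_H = \sum_{K\subseteq M} (-1)^{|K|} R_{G\cup K}.$$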
These operators act on the full simplex and can be interpreted in analogy to the original LDE
operators (10), where the links in the complement of $G \cup M$ (the disjoint union of $G$ and $M$) are
regarded as inseparable. If necessary, we will specify the system the operators are acting on by an
upper index as before.

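Lemma 2 On $\mathcal{P}(X)$, the operators (28) satisfy

$$\widetilde{T}_{G,G} = R_G \quad\text{and}\quad \widetilde{T}_{G,L} = T_G.$$

They have a product structure,

$$\widetilde{T}^{(L)}_{G,G\cup H}(p) = \big(\widetilde{T}^{(I_0)}_{\varnothing,H\cap I_0}(\pi_{J_0}.p)\big) \otimes \big(\widetilde{T}^{(I_1)}_{\varnothing,H\cap I_1}(\pi_{J_1}.p)\big) \otimes \cdots \otimes \big(\widetilde{T}^{(I_{|G|})}_{\varnothing,H\cap I_{|G|}}(\pi_{J_{|G|}}.p)\big) \qquad (29)$$

for all $H \subseteq L\setminus G$. Moreover, one has

$$\widetilde{T}_{G,G\cup M} = \sum_{G\subseteq H\subseteq L\setminus M} T_H = \sum_{K\subseteq L\setminus(M\cup G)} T_{G\cup K} \qquad (30)$$

for all $G, M \subseteq L$ with $G \cap M = \varnothing$. Consequently, Möbius inversion returns $T_G$ as

$$T_G = \sum_{G\subseteq H\subseteq L\setminus M} (-1)^{|H-G|}\, \widetilde{T}_{H,H\cup M}. \qquad (31)$$

Proof The first assertion is obvious; the second is analogous to (26) and follows along the same lines.
Relation (30) is true since

$$\widetilde{T}_{G,G\cup M} = \sum_{K\subseteq M} (-1)^{|K|} R_{G\cup K} = \sum_{K\subseteq M} (-1)^{|K|} \sum_{H\supseteq G\cup K} T_H$$
$$= \sum_{H\supseteq G} \sum_{\substack{K\subseteq M\\ K\subseteq H}} (-1)^{|K|}\, T_H = \sum_{H\supseteq G} T_H \sum_{K\subseteq M\cap H} (-1)^{|K|}$$
$$= \sum_{H\supseteq G} \delta_{M\cap H,\varnothing}\, T_H = \sum_{G\subseteq H\subseteq L\setminus M} T_H.$$

In the second-last step, we used that, if $H$ is a finite set, one has

$$\sum_{G\subseteq H} (-1)^{|G|} = \delta_{H,\varnothing}, \qquad (32)$$

which is the key property of the Möbius function of ordered partitions. □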

Before we turn to the commutator, we introduce a new function, the separation function, which 
will allow for a clear and compact notation. 

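Definition 3 For $G, H \subseteq L$ with $G \cap H = \varnothing$, we say that $G$ separates $H$ if, for all $\alpha, \beta \in H$ with
$\alpha < \beta$, there is a $\gamma \in G$ with $\alpha < \gamma < \beta$. Hence, we define the separation function as

$$\mathrm{sep}(G,H) = \begin{cases} 1, & \text{if } G \text{ separates } H,\\ 0, & \text{otherwise.} \end{cases}$$

In the particular cases $H = \varnothing$ and $H = \{\alpha\}$, $\alpha \in L$, we define $\mathrm{sep}(G, H) = 1$ for all $G \subseteq L$, and it is
understood that $\mathrm{sep}(G, H) = 0$ whenever $G \cap H \ne \varnothing$.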
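A minimal Python sketch of the separation function may help to fix ideas. It assumes that links are represented by any totally ordered labels (integers or half-integers), and that the convention $\mathrm{sep}(G,H)=0$ for intersecting sets takes precedence over the special cases; the function name `sep` mirrors the notation of Definition 3 but is otherwise a hypothetical helper.

```python
def sep(G, H):
    """Separation function: 1 if G separates H, 0 otherwise.

    G separates H if between any two elements of H there is an element of G.
    It suffices to check neighbouring elements of H in sorted order.
    """
    G, H = set(G), set(H)
    if G & H:
        # Assumed convention: intersecting sets never separate.
        return 0
    h = sorted(H)
    return int(all(any(a < g < b for g in G) for a, b in zip(h, h[1:])))
```

With links labelled 0, 1, 2, 3 in increasing order, `sep({2}, {1, 3})` equals 1, whereas `sep({0}, {1, 2})` equals 0 because the two adjacent links 1 and 2 cannot be separated.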

First, let us summarise some elementary properties of the separation function. 

Lemma 3 The separation function $\mathrm{sep}(G, H)$ with $H \subseteq L \setminus G$ has the following properties:

1. $\mathrm{sep}(G, H) = 0$, if $H$ contains any adjacent links;

2. $\mathrm{sep}(G, H) = 0$ implies $\mathrm{sep}(G', H) = 0$ for all $G' \subseteq G$;

3. $\mathrm{sep}(G, H) = 0$ whenever $L \setminus G$ is contiguous with $H \subseteq L\setminus G$ and $|H| \ge 2$;

4. $\mathrm{sep}(G, H) = 1$ implies $I_i^H \cap G \ne \varnothing$ for all $i \in \{1, \dots, |H| - 1\}$. □
Later, we need the following summation formula for the separation function. 

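Lemma 4 Let $H, K \subseteq L$ with $H \ne \varnothing$, $H \cap K = \varnothing$, and $I_i^H$ defined as in (24). Then

$$\sum_{G\subseteq K} (-1)^{|G|}\, \mathrm{sep}(G,H) = \mathrm{sep}(K,H)\, (-1)^{|H|-1}\, \delta_{K\cap I_0^H,\varnothing}\, \delta_{K\cap I_{|H|}^H,\varnothing}.$$

Proof For $\mathrm{sep}(K, H) = 0$, the claim is clear by Lemma 3(2). We now define $A_i := K \cap I_i^H$ for
all $i \in \{0, \dots, |H|\}$. Then, for $\mathrm{sep}(K,H) = 1$, it follows from Lemma 3(4) that $A_j \ne \varnothing$ for all
$1 \le j \le |H| - 1$. Likewise, since $G \subseteq K$, $\mathrm{sep}(G,H) = 1$ if and only if $G \cap I_j^H \ne \varnothing$ for all
$1 \le j \le |H| - 1$, with no condition emerging for $G \cap I_0^H$ or $G \cap I_{|H|}^H$. This gives

$$\sum_{G\subseteq K} (-1)^{|G|}\, \mathrm{sep}(G,H) = \Big(\sum_{B_0\subseteq A_0} (-1)^{|B_0|}\Big) \prod_{j=1}^{|H|-1} \Big(\sum_{\varnothing\ne B_j\subseteq A_j} (-1)^{|B_j|}\Big) \Big(\sum_{B_{|H|}\subseteq A_{|H|}} (-1)^{|B_{|H|}|}\Big) = \prod_{j=0}^{|H|} F_j.$$

Here, for $j = 0$ and $j = |H|$, the factors $F_j$ are given by $F_j := \sum_{B_j\subseteq A_j} (-1)^{|B_j|} = \delta_{A_j,\varnothing}$, where we
have used (32). For $1 \le j \le |H| - 1$,

$$F_j := \sum_{\varnothing\ne B_j\subseteq A_j} (-1)^{|B_j|} = -1 + \sum_{B_j\subseteq A_j} (-1)^{|B_j|} = -1 + \delta_{A_j,\varnothing} = -1,$$

where we have again used (32) in the second-last step, and $A_j \ne \varnothing$ in the last. □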
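The summation formula of Lemma 4 lends itself to a brute-force check on a small system. The following Python sketch is only an illustration: it assumes that links are labelled by their positions 0, 1, 2, ... (only their order matters here), and the helper names `subsets`, `sep`, `lhs` and `rhs` are hypothetical.

```python
from itertools import combinations

def subsets(s):
    """Yield all subsets of s as frozensets."""
    s = sorted(s)
    for r in range(len(s) + 1):
        for combo in combinations(s, r):
            yield frozenset(combo)

def sep(G, H):
    """Separation function of Definition 3 (0 for intersecting sets)."""
    if set(G) & set(H):
        return 0
    h = sorted(H)
    return int(all(any(a < g < b for g in G) for a, b in zip(h, h[1:])))

def lhs(K, H):
    """Left-hand side: alternating sum of sep(G, H) over all subsets G of K."""
    return sum((-1) ** len(G) * sep(G, H) for G in subsets(K))

def rhs(K, H):
    """Right-hand side; the two deltas test whether K avoids the outer segments of H."""
    lo, hi = min(H), max(H)
    delta_first = int(not any(k < lo for k in K))   # K has no link below min(H)
    delta_last = int(not any(k > hi for k in K))    # K has no link above max(H)
    return sep(K, H) * (-1) ** (len(H) - 1) * delta_first * delta_last
```

Comparing `lhs(K, H)` and `rhs(K, H)` over all non-empty `H` and all `K` disjoint from `H` in a system with five links exercises every case distinguished in the proof.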

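With this notation, let us take a closer look at $R_G^{(L)}(\Phi^{(L)}(p))$ for $G \subseteq L$. Evaluating (27) explicitly,
using Definition 2, expanding and using the product structure (29) backwards gives

$$R_G^{(L)}\big(\Phi^{(L)}(p)\big) = \bigotimes_{i=0}^{|G|} \Big(\pi_{J_i}.p + \sum_{\alpha_i\in I_i} \rho_{\alpha_i} \big(R_{\alpha_i}^{(I_i)} - \mathbb{1}\big)(\pi_{J_i}.p)\Big)$$
$$= \sum_{H\subseteq L\setminus G} (-1)^{|H|}\, \mathrm{sep}(G,H)\, \rho_H\, \widetilde{T}^{(L)}_{G,G\cup H}(p),$$

where, in the last step, we have further set $\rho_H = \prod_{\alpha\in H} \rho_\alpha$ for all $H \subseteq L$ (in particular, $\rho_\varnothing = 1$) and
used Lemma 3(4); note that the separation function is basically used as an indicator variable here.
On the other hand, we obtain

$$\Phi \circ R_G = \sum_{\alpha\in L\setminus G} \rho_\alpha R_\alpha R_G + \Big(1 - \sum_{\alpha\in L\setminus G} \rho_\alpha\Big) R_G$$
$$= \widetilde{T}_{G,G} - \sum_{\alpha\in L\setminus G} \rho_\alpha\, \widetilde{T}_{G,G\cup\{\alpha\}}$$
$$= \mathrm{sep}(G,\varnothing)\, \widetilde{T}_{G,G} - \sum_{\alpha\in L\setminus G} \mathrm{sep}(G,\{\alpha\})\, \rho_\alpha\, \widetilde{T}_{G,G\cup\{\alpha\}},$$

which finally yields the commutator.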
Theorem 4 For all $G \subseteq L$, the commutator on $\mathcal{P}(X)$ is given by

$$[R_G, \Phi] = \sum_{\substack{H\subseteq L\setminus G\\ |H|\ge 2}} (-1)^{|H|}\, \mathrm{sep}(G,H)\, \rho_H\, \widetilde{T}_{G,G\cup H}.$$

□

Please note that, by the properties of the separation function, many of the summands vanish. In
particular, $[R_G, \Phi] = 0$ whenever $|L\setminus G| \le 1$.

Corollary 2 $[R_G, \Phi] = 0$ if $L\setminus G$ is contiguous.

Proof By Theorem 4, only terms with $|H| \ge 2$ need be considered. For these, Lemma 3(3) tells us
that $\mathrm{sep}(G, H) = 0$ if $L \setminus G$ is contiguous and $H \subseteq L\setminus G$. Hence, $[R_G, \Phi] = 0$. □

Let us note in passing that the converse direction of Corollary 2 may fail if the site spaces are
sufficiently trivial. Nevertheless, in the generic case, $[R_G, \Phi] = 0$ implies $\mathrm{sep}(G, H) = 0$ for all
$H \subseteq L\setminus G$ with $|H| \ge 2$, because the relevant terms then cannot cancel each other. We omit a more
precise discussion of this point, because we do not need it later on.

Recalling that $\Phi^t$ is the discrete-time analogue of $\varphi_t$, we can consider Corollary 2 as what is
left of Corollary 1 in discrete time. Hence, it becomes clear why the LDE operators (10) from the
continuous-time model do not suffice to linearise and decouple the discrete-time dynamics.

We still aim at determining $T_\varnothing \circ \Phi$ according to Proposition 6, expressing the commutator $[R_G, \Phi]$
in terms of the $T_G$ (which are related to the $\widetilde{T}_{G,K}$ via (30)).

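Theorem 5 On $\mathcal{P}(X)$, the operators $T_G = T_G^{(L)}$ and $\Phi = \Phi^{(L)}$ satisfy

$$T_G^{(L)} \circ \Phi^{(L)} = \sum_{K\supseteq G} z^{(L)}(G,K)\, T_K^{(L)} \qquad (33)$$

for all $G \subseteq L$. The coefficients $z^{(L)}(\varnothing, K)$, $K \subseteq L$, are given by

$$z^{(L)}(\varnothing,\varnothing) = 1 - \sum_{\alpha\in L} \rho_\alpha = \eta \qquad (34)$$

and, for $K \ne \varnothing$, by

$$z^{(L)}(\varnothing,K) = -\sum_{H\subseteq L\setminus K} \rho_H\, \mathrm{sep}(K,H)\, \big(1 - \delta_{H\cap I_0^K,\varnothing}\big)\big(1 - \delta_{H\cap I_{|K|}^K,\varnothing}\big). \qquad (35)$$

For $K \supseteq G \ne \varnothing$, the coefficients are recursively determined by

$$z^{(L)}(G,K) = z^{(I_0)}(\varnothing, K\cap I_0) \cdot \ldots \cdot z^{(I_{|G|})}(\varnothing, K\cap I_{|G|}),$$

where $I_i = I_i^G$.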

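Proof Let us first prove the case $G = \varnothing$. According to Proposition 6, we have $T_\varnothing \circ \Phi = \eta\, T_\varnothing +
\sum_{G'\subseteq L} (-1)^{|G'|} [R_{G'}, \Phi]$, where $\eta = z^{(L)}(\varnothing,\varnothing)$ by definition. Let us thus evaluate the last term. In
the first step, we insert the commutator from Theorem 4; we then use Definition 2 and change the
order of summation to arrive at

$$\sum_{G'\subseteq L} (-1)^{|G'|} [R_{G'}, \Phi] = \sum_{G'\subseteq L} (-1)^{|G'|} \sum_{\substack{H\subseteq L\setminus G'\\ |H|\ge 2}} (-1)^{|H|}\, \mathrm{sep}(G',H)\, \rho_H\, \widetilde{T}_{G',G'\cup H}$$
$$= \sum_{G'\subseteq L} (-1)^{|G'|} \sum_{\substack{H\subseteq L\setminus G'\\ |H|\ge 2}} (-1)^{|H|}\, \mathrm{sep}(G',H)\, \rho_H \sum_{G'\subseteq K\subseteq L\setminus H} T_K$$
$$= \sum_{K\subseteq L} T_K \sum_{\substack{H\subseteq L\setminus K\\ |H|\ge 2}} (-1)^{|H|} \rho_H \sum_{G'\subseteq K} (-1)^{|G'|}\, \mathrm{sep}(G',H), \qquad (36)$$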






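which does not contain any term with $T_\varnothing$. We can now compare coefficients for $T_K$. Note first that, by
(36), we only need to consider sets $H \subseteq L\setminus K$, that is, $H \cap K = \varnothing$. In this case, $\delta_{K\cap I_0^H,\varnothing} = 1 - \delta_{H\cap I_0^K,\varnothing}$
and $\delta_{K\cap I_{|H|}^H,\varnothing} = 1 - \delta_{H\cap I_{|K|}^K,\varnothing}$. This is true since $K \cap I_0^H = \varnothing$ ($\ne \varnothing$) implies that the smallest element
in $H$ is smaller (larger) than the smallest element in $K$, thus $H \cap I_0^K \ne \varnothing$ ($= \varnothing$) (and vice versa for
$K \cap I_{|H|}^H$). Taking this together with Lemma 4, the coefficient of $T_K$ in (36) turns into

$$\sum_{\substack{H\subseteq L\setminus K\\ |H|\ge 2}} (-1)^{|H|} \rho_H \sum_{G'\subseteq K} (-1)^{|G'|}\, \mathrm{sep}(G',H)$$
$$= -\sum_{\substack{H\subseteq L\setminus K\\ |H|\ge 2}} \rho_H\, \mathrm{sep}(K,H)\, \big(1 - \delta_{H\cap I_0^K,\varnothing}\big)\big(1 - \delta_{H\cap I_{|K|}^K,\varnothing}\big)$$
$$= -\sum_{H\subseteq L\setminus K} \rho_H\, \mathrm{sep}(K,H)\, \big(1 - \delta_{H\cap I_0^K,\varnothing}\big)\big(1 - \delta_{H\cap I_{|K|}^K,\varnothing}\big).$$

Note that, in the last step, the restriction on $|H|$ may be dropped since it is already implied by the
factors involving the $\delta$-functions. This proves the claim for $G = \varnothing$. For the case $G \ne \varnothing$, we follow
Proposition 5 and write, for $p \in \mathcal{P}(X)$,

$$T_G^{(L)}\big(\Phi^{(L)}(p)\big) = \big(T_\varnothing^{(I_0)}(\Phi^{(I_0)}(\pi_{J_0}.p))\big) \otimes \big(T_\varnothing^{(I_1)}(\Phi^{(I_1)}(\pi_{J_1}.p))\big) \otimes \cdots \otimes \big(T_\varnothing^{(I_{|G|})}(\Phi^{(I_{|G|})}(\pi_{J_{|G|}}.p))\big).$$

Applying the above result for $G = \varnothing$ to each factor, and using the product structure of Proposition 5
backwards, establishes the claim. □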

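Corollary 3 The coefficients $z^{(L)}(\varnothing, K)$ with $K \ne \varnothing$ can be expressed explicitly as

$$z^{(L)}(\varnothing,K) = -\Big(\sum_{\alpha_0\in I_0^K} \rho_{\alpha_0}\Big) \Big(\prod_{i=1}^{|K|-1} \Big(1 + \sum_{\alpha_i\in I_i^K} \rho_{\alpha_i}\Big)\Big) \Big(\sum_{\alpha_{|K|}\in I_{|K|}^K} \rho_{\alpha_{|K|}}\Big). \qquad (37)$$

Proof Let us consider those $H$ whose contribution to the sum in (35) is not annihilated by the
separation function or the $\delta$-functions. For $\mathrm{sep}(K, H) = 1$ to hold, each $\alpha \in H$ must belong to a
different $I_i^K \in \mathcal{L}_K$. Furthermore, $H$ must contain one element each from $I_0^K$ and $I_{|K|}^K$ ($\alpha_0$ and $\alpha_{|K|}$,
respectively) to keep the factors involving the $\delta$-functions from vanishing. Thus, the sum in (35) may
be factorised as claimed. □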
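The agreement between the sum over subsets in (35) and the factorised form (37) can be checked by brute force on a small system. The following Python sketch is an illustration only: links are assumed to be labelled by their positions 0, 1, 2, ..., the recombination probabilities are arbitrary exact fractions, and the names `subsets`, `sep`, `z35` and `z37` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def subsets(s):
    """Return all subsets of s as frozensets."""
    s = sorted(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def sep(G, H):
    """Separation function of Definition 3 (0 for intersecting sets)."""
    if set(G) & set(H):
        return 0
    h = sorted(H)
    return int(all(any(a < g < b for g in G) for a, b in zip(h, h[1:])))

def z35(L, K, rho):
    """z(empty, K) for non-empty K, summing over all H in L minus K as in (35)."""
    lo, hi = min(K), max(K)
    total = 0
    for H in subsets(a for a in L if a not in K):
        hits_first = any(h < lo for h in H)   # H meets the first segment of K
        hits_last = any(h > hi for h in H)    # H meets the last segment of K
        if sep(K, H) and hits_first and hits_last:
            total += prod(rho[h] for h in H)
    return -total

def z37(L, K, rho):
    """z(empty, K) for non-empty K via the factorised form (37)."""
    segs, cur = [], []
    for a in sorted(L):
        if a in K:
            segs.append(cur)
            cur = []
        else:
            cur.append(a)
    segs.append(cur)
    value = -sum(rho[a] for a in segs[0]) * sum(rho[a] for a in segs[-1])
    for seg in segs[1:-1]:
        value *= 1 + sum(rho[a] for a in seg)
    return value

# Example data (assumed values): five links with exact rational probabilities.
L = list(range(5))
rho = [Fraction(1, k) for k in (10, 11, 12, 13, 14)]
```

Exact rational arithmetic is used so that the two expressions can be compared with `==` rather than up to rounding.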

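In particular, $z^{(L)}(\varnothing, K) = 0$ if $K \cap \{\tfrac12, \tfrac{2n-1}{2}\} \ne \varnothing$. Taking this together with (34), one obtains
$z^{(L)}(\varnothing, K) = \big(1 - \sum_{\alpha\in L} \rho_\alpha\big)\, \delta_{K,\varnothing}$ for $K \subseteq L$ whenever $|L| \le 2$, and hence, in these cases, $T_\varnothing^{(L)} \circ \Phi^{(L)} =
\big(1 - \sum_{\alpha\in L} \rho_\alpha\big)\, T_\varnothing^{(L)}$ is already a diagonal component in line with the observation in Section 4.
Furthermore, Theorem 5 and (37) entail that $z^{(L)}(G, K) = 0$ whenever

$$K \cap \Big(\bigcup_{0\le i\le |G|} \{\min(I_i^G), \max(I_i^G)\}\Big) \ne \varnothing. \qquad (38)$$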

Theorem 5 reveals the linear structure inherent in the action of $T_G$ on $\Phi$. In fact, the structure is
even triangular (with respect to the partial ordering) since $T_G \circ \Phi$ is a linear combination of the
$T_K$, $K \supseteq G$. Thus, diagonalisation will boil down to recursive elimination. As a preparation, we
make the following observation.

Corollary 4 If $L \ne \varnothing$, one has the relation $z^{(L)}(G, L) = 0$ for all $\varnothing \subseteq G \subsetneq L$.

Proof When $\varnothing \subseteq G \subsetneq L$, the intersection in (38), with $K = L$, can never be empty, so that
$z^{(L)}(G, L) = 0$ follows. □
7 Diagonalisation

Motivated by the triangular structure of (33), we make the ansatz to define new operators $U_G$,
$G \subseteq L$, as the following linear combination of the well-known $T_G$:

$$U_G = \sum_{H\supseteq G} c(G,H)\, T_H, \qquad (39)$$

where the coefficients $c(G, H)$ are to be determined in such a way that they transform the recombination
equation into a decoupled diagonal system, more precisely so that

$$U_G \circ \Phi = \lambda_G\, U_G, \qquad G \subseteq L, \qquad (40)$$

with eigenvalues $\lambda_G$ that are still unknown as well. An example for this transformation can be found
in Appendix A. Note first that, with the help of (33), Eqs. (39) and (40) may be rewritten as

$$U_G \circ \Phi = c(G,G)\, T_G \circ \Phi + \sum_{N\supsetneq G} c(G,N)\, (T_N \circ \Phi)$$
$$= c(G,G) \Big(z^{(L)}(G,G)\, T_G + \sum_{K\supsetneq G} z^{(L)}(G,K)\, T_K\Big) + \sum_{N\supsetneq G} c(G,N) \Big(z^{(L)}(N,N)\, T_N + \sum_{M\supsetneq N} z^{(L)}(N,M)\, T_M\Big)$$
$$= \lambda_G \Big(c(G,G)\, T_G + \sum_{N\supsetneq G} c(G,N)\, T_N\Big) = \lambda_G\, U_G. \qquad (41)$$

Obviously, there is some freedom in the choice of the $c(G, G)$; we set $c(G, G) = 1$ for all $G \subseteq L$ (and
we will see shortly that this is consistent). Eq. (41) has the structure of an eigenvalue problem of
a triangular matrix with coefficients $z^{(L)}(G,H)$, where the role of the unit vectors is taken by the
$T_H$, and the $c(G,H)$, $H \supseteq G$, take the roles of the components of the eigenvector corresponding
to $\lambda_G$ (note that, by considering $c(G, H)$ for $H \supseteq G$ only, we have already exploited the triangular
structure). Recall next that the eigenvalues of a triangular matrix are given by its diagonal entries,
which are

$$\lambda_G = z^{(L)}(G,G) = \prod_{i=0}^{|G|} z^{(I_i)}(\varnothing,\varnothing) = \prod_{i=0}^{|G|} \Big(1 - \sum_{\alpha_i\in I_i} \rho_{\alpha_i}\Big) \qquad (42)$$

by Theorem 5. In particular, $\lambda_\varnothing = \eta = 1 - \sum_{\alpha\in L} \rho_\alpha \ge 0$. The $\lambda_G$ describe the probability that there
is no further recombination between the respective sites of the subsystems corresponding to $G$; they
have already been identified by Bennett [sl and Dawson 0, . 
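The eigenvalues (42) are easy to evaluate directly. The following Python sketch assumes that links are half-integer `Fraction` keys of a dictionary `rho` of recombination probabilities; the function name `eigenvalue` and the numerical values are hypothetical and chosen for the five-site system with $L = \{\tfrac12, \tfrac32, \tfrac52, \tfrac72\}$.

```python
from fractions import Fraction

def eigenvalue(L, G, rho):
    """lambda_G: product over the segments of L cut at G of (1 - sum of rho in the segment)."""
    lam, seg_sum = 1, 0
    for a in sorted(L):
        if a in G:
            # A cut closes the current segment.
            lam *= 1 - seg_sum
            seg_sum = 0
        else:
            seg_sum += rho[a]
    return lam * (1 - seg_sum)

half = Fraction(1, 2)
L = [half, 3 * half, 5 * half, 7 * half]
# Assumed example probabilities; they sum to 7/10, so eta = 3/10.
rho = {half: Fraction(1, 10), 3 * half: Fraction(1, 5),
       5 * half: Fraction(1, 4), 7 * half: Fraction(3, 20)}
```

With these values, the eigenvalue for no cut is $\eta$, the eigenvalue for $G = L$ is 1, and adding links to $G$ increases the eigenvalue.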

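Lemma 5 For all $G, H \subseteq L$ with $G \subsetneq H$, one has $\lambda_G < \lambda_H$.

Proof Let $\varnothing \ne G \subsetneq L$. Then, for $H = G \cup \{\beta\}$, with $\beta \in I_i^G$ for an arbitrary $i \in \{0, \dots, |G|\}$, we see
from (42) that $z^{(L)}(H, H) = \lambda_H$ and hence obtain

$$\lambda_H = \prod_{\substack{j=0\\ j\ne i}}^{|G|} \Big(1 - \sum_{\alpha_j\in I_j^G} \rho_{\alpha_j}\Big) \Big(1 - \sum_{\substack{\alpha_i\in I_i^G\\ \alpha_i<\beta}} \rho_{\alpha_i}\Big) \Big(1 - \sum_{\substack{\alpha_i\in I_i^G\\ \alpha_i>\beta}} \rho_{\alpha_i}\Big)$$
$$= \lambda_G\, \frac{\Big(1 - \sum_{\alpha_i\in I_i^G,\ \alpha_i<\beta} \rho_{\alpha_i}\Big) \Big(1 - \sum_{\alpha_i\in I_i^G,\ \alpha_i>\beta} \rho_{\alpha_i}\Big)}{1 - \sum_{\alpha_i\in I_i^G} \rho_{\alpha_i}} > \lambda_G,$$

because all $\rho_\alpha$ are positive, as are all three terms in parentheses of the fraction, and $\lambda_G > 0$ by
assumption. Finally, the argument also works for $\lambda_\varnothing = \eta$, provided $\eta > 0$. Since $\lambda_G > 0$ for all
$G \ne \varnothing$, the claim trivially also holds for $\eta = 0$. The assertion then follows inductively for any
$H \supsetneq G$. □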

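The coefficients $c(G, H)$ can now be calculated recursively as follows.

Theorem 6 The coefficients $c(G, H)$ of (39) are determined by $c(G, G) = 1$ and

$$c(G,H) = \frac{1}{\lambda_G - \lambda_H} \sum_{G\subseteq K\subsetneq H} c(G,K)\, z^{(L)}(K,H) \qquad (43)$$

for $H \supsetneq G$. The coefficients of the inverse transformation of (39),

$$T_G = \sum_{H\supseteq G} c^*(G,H)\, U_H, \quad\text{with } G \subseteq L, \qquad (44)$$

are determined by

$$c^*(G,K) = -\sum_{K\supsetneq H\supseteq G} c^*(G,H)\, c(H,K), \qquad (45)$$

for $K \supsetneq G$ together with $c^*(G, G) = 1$.

Proof Considering (41) with $c(G, G) = 1$, comparing coefficients for $T_H$, $H \supsetneq G$, and observing (42),
one obtains

$$z^{(L)}(G,H) + c(G,H)\, \lambda_H + \sum_{H\supsetneq K\supsetneq G} c(G,K)\, z^{(L)}(K,H) = \lambda_G\, c(G,H),$$

and the recursion for $c(G, H)$ follows. It is always well-defined for all $H \supsetneq G$, since $\lambda_G < \lambda_H$ by
Lemma 5. The recursion for the coefficients of the inverse transformation follows directly from

$$T_G = \sum_{H\supseteq G} c^*(G,H)\, U_H = \sum_{H\supseteq G} c^*(G,H) \sum_{K\supseteq H} c(H,K)\, T_K = \sum_{K\supseteq G} T_K \sum_{K\supseteq H\supseteq G} c^*(G,H)\, c(H,K),$$

which enforces $\sum_{K\supseteq H\supseteq G} c^*(G,H)\, c(H,K) = \delta_{G,K}$, as the $T_K$ are distinct. □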

We now identify those $T_G$ that already give diagonal components of the discrete-time system:

Theorem 7 For all $G \subseteq L$ that satisfy $|I_i^G| \le 2$ for all $i \in \{0, \dots, |G|\}$, one has

$$T_G(\Phi(p)) = \lambda_G\, T_G(p)$$

for $p \in \mathcal{P}(X)$.

Proof In this case, we have (38) for all $K \supsetneq G$, hence $z(G,K) = \lambda_G\, \delta_{K,G}$, from which the assertion
follows via Theorem 5. □

Note that $|I_i^G| \le 2$ for all $I_i^G \in \mathcal{L}_G$ simply implies that each subsystem consists of at most three
sites, hence all subsystems can be reduced to the simple cases considered in Section 4. Then, for such
$G$, $c(G,H) = c^*(G,H) = \delta_{G,H}$ for all $H \supseteq G$.

With the help of this transformation, we can finally specify the solution pt of the recombination 
equation in terms of the initial condition Pq. To this end, we first use the transformation pip from 
the recombinators to the Tq operators, and then relation (|44p to arrive at the Ufj operators, which 
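finally diagonalise the system according to (40). Finally, we use the appropriate inversions to return
to the recombinators:

$$p_t = \Phi^t(p_0) = R_\varnothing\big(\Phi^t(p_0)\big) = \sum_{G\subseteq L} T_G\big(\Phi^t(p_0)\big) = \sum_{G\subseteq L} \sum_{H\supseteq G} c^*(G,H)\, U_H\big(\Phi^t(p_0)\big)$$
$$= \sum_{G\subseteq L} \sum_{H\supseteq G} c^*(G,H)\, \lambda_H^t\, U_H(p_0) = \sum_{G\subseteq L} \sum_{H\supseteq G} c^*(G,H)\, \lambda_H^t \sum_{M\supseteq H} c(H,M)\, T_M(p_0) \qquad (46)$$
$$= \sum_{G\subseteq L} \sum_{H\supseteq G} c^*(G,H)\, \lambda_H^t \sum_{M\supseteq H} c(H,M) \sum_{N\supseteq M} (-1)^{|N-M|} R_N(p_0).$$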

The coefficient functions can now be extracted as follows. 

Theorem 8 The coefficient functions aQ{t) of the solution psp of the recombination equation in 
discrete time may be expressed as

$$a_G(t) = \sum_{M\subseteq G} (-1)^{|G-M|} \sum_{H\subseteq M} \sum_{K\subseteq H} c(H,M)\, \lambda_H^t\, c^*(K,H)$$

for all $G \subseteq L$. Here, $c(H, M)$ and $c^*(K, H)$ are the coefficients of Theorem 6. □
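The whole chain (coefficients $z$ from Theorem 5 and Corollary 3, eigenvalues (42), the recursions of Theorem 6, and the coefficient functions of Theorem 8) can be sketched in Python with exact rational arithmetic. This is a minimal illustration under stated assumptions, not a reference implementation: links are labelled by their positions 0, 1, 2, ..., all recombination probabilities are assumed strictly positive, and every function name (`subsets`, `segments`, `z_empty`, `z`, `transformation`, `a`) as well as the numerical values are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def subsets(s):
    """All subsets of s as frozensets, ordered by increasing size."""
    s = sorted(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def segments(I, K):
    """Split the sorted list of links I at the links in K (gives |K| + 1 lists)."""
    segs, cur = [], []
    for link in I:
        if link in K:
            segs.append(cur)
            cur = []
        else:
            cur.append(link)
    segs.append(cur)
    return segs

def z_empty(I, K, rho):
    """z(empty, K) on the subsystem with links I, following (34) and (37)."""
    if not K:
        return 1 - sum(rho[b] for b in I)
    segs = segments(I, K)
    value = -sum(rho[b] for b in segs[0]) * sum(rho[b] for b in segs[-1])
    for seg in segs[1:-1]:
        value *= 1 + sum(rho[b] for b in seg)
    return value

def z(L, G, K, rho):
    """z(G, K) for G a subset of K: product over the subsystems defined by G."""
    value = 1
    for I in segments(L, G):
        value *= z_empty(I, [k for k in K if k in I], rho)
    return value

def transformation(L, rho):
    """Eigenvalues and the coefficients c, c* via the recursions (43) and (45)."""
    subs = subsets(L)
    lam = {G: z(L, G, G, rho) for G in subs}
    c, cs = {}, {}
    for G in subs:
        c[G, G] = cs[G, G] = Fraction(1)
        for H in subs:                       # increasing size, so c[G, K] is known
            if G < H:
                num = sum(c[G, K] * z(L, K, H, rho) for K in subs if G <= K < H)
                c[G, H] = num / (lam[G] - lam[H])
    for G in subs:
        for K in subs:
            if G < K:
                cs[G, K] = -sum(cs[G, H] * c[H, K] for H in subs if G <= H < K)
    return lam, c, cs

def a(G, t, L, rho):
    """Coefficient function a_G(t) according to the formula of Theorem 8."""
    lam, c, cs = transformation(L, rho)
    G = frozenset(G)
    total = 0
    for M in subsets(G):
        for H in subsets(M):
            for K in subsets(H):
                total += (-1) ** len(G - M) * c[H, M] * lam[H] ** t * cs[K, H]
    return total

# Five sites, i.e. four links, with assumed example probabilities.
L = [0, 1, 2, 3]
rho = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 4), Fraction(3, 20)]
```

A useful sanity check is the first generation: since $\Phi = \eta\,\mathbb{1} + \sum_\alpha \rho_\alpha R_\alpha$, one expects `a(G, 1, L, rho)` to equal $\eta$ for the empty set, $\rho_\alpha$ for a single link, and zero otherwise. The subset-based loops grow exponentially in the number of links, so the sketch is meant for small systems only.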

To derive the asymptotic behaviour for large iteration numbers, we need the following property 
of the coefficients. 

Lemma 6 The coefficients $c(G,L)$ and $c^*(G,L)$ satisfy $c(G,L) = c^*(G,L) = \delta_{G,L}$ for arbitrary
$\varnothing \subseteq G \subseteq L$.

Proof We have $c(G, G) = c^*(G, G) = 1$ for all $G$ by Theorem 6. The claim for $c(G, L)$ now follows
from the recursion (43) together with Corollary 4. Inserting this into recursion (45) establishes the
relation for the $c^*(G, L)$. □

As an example, the path to a solution via the above chain of transformations for the model with 
five sites will be presented in Appendix A. 

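Finally, let us consider what happens in the limit as $t \to \infty$.

Proposition 7 The solution $p_t$ of the recombination equation (14) with initial condition $p_0$ satisfies

$$p_t \xrightarrow{\;t\to\infty\;} R_L(p_0) = \bigotimes_{i=0}^{n} (\pi_i.p_0),$$

with exponentially fast convergence in the norm topology.

Proof When expressing $p_t$ in terms of $U_H$ according to (46), we first observe $p_t = U_L(p_0) +
\sum_{G\subsetneq L} \sum_{G\subseteq H\subsetneq L} c^*(G,H)\, \lambda_H^t\, U_H(p_0)$, because $\lambda_L = 1$ and $c^*(G,L) = \delta_{G,L}$ by Lemma 6. Since
$U_L = R_L$, we obtain the following estimate in the variation norm

$$\|p_t - R_L(p_0)\| = \Big\| \sum_{G\subsetneq L} \sum_{G\subseteq H\subsetneq L} c^*(G,H)\, \lambda_H^t\, U_H(p_0) \Big\|$$
$$\le \sum_{H\subsetneq L} \lambda_H^t\, \Big\| \sum_{G\subseteq H} c^*(G,H)\, U_H(p_0) \Big\| \xrightarrow{\;t\to\infty\;} 0,$$

which establishes the claim since $\lambda_H < 1$ for $H \ne L$. □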

As was to be expected, the solution of the recombination equation converges towards the independent 
combination of the alleles, that is towards linkage equilibrium. 
8 Discussion

In this paper, we have investigated the dynamics of an 'infinite' population that evolves due to
recombination alone. To this end, we assumed discrete (non-overlapping) generations, and restricted
ourselves to the case of single crossovers. Previous results had shown that the corresponding
single-crossover dynamics in continuous time admits a closed solution [3]. This astonishing result is
concordant with a 'hidden' linearity in the system that is due to independence of links. The fact that
crossovers at different links occur independently manifests itself in the product structure of the
coefficient functions of the solution ensuing from the linear action of the nonlinear recombination
operators along the solution of the recombination equation. Additionally, in [3], a certain set of
linkage disequilibria was found that linearise and diagonalise the dynamics.

Since the overwhelming part of the literature deals with discrete-time models, our aim was to 
find out whether, and to what extent, these continuous-time results carry over to single-crossover 
dynamics in discrete time. We could show that the discrete-time dynamics is far more complex than 
the continuous-time one, and, as a consequence, a closed solution cannot be given. 

The main reason for these difficulties lies in the fact that the key feature of the continuous-time 
model, the independence of links, does not carry over to discrete time. This is due to interference: 
The occurrence of a recombination event in the discrete-time model forbids any further crossovers in 
the same generation. In connection with this, the recombinators do not, in general, act linearly on 
the right-hand side of the recombination equation. Likewise, the coefficient functions of the solution 
follow a nonlinear iteration that cannot be solved explicitly. 

While Geiringer pjj] developed a skilful procedure for the generation-wise evaluation of these 
coefficients, we constructed a method that allows for an explicit formula valid for all times, once the 
coefficients of the transformation have been determined recursively for a given system. 

As in previous approaches, this is achieved by a transformation of the nonlinear, coupled system 
of equations to a linear diagonal one. This was done before by Bennett and Dawson for 
the more general recombination equation (without restriction to single crossovers), and they pre- 
sented an appropriate transformation that includes parameters that must be determined recursively. 
Unfortunately, the corresponding derivations are rather technical and fail to reveal the underlying 
mathematical structure. It was our aim to improve on this and add some structural insight. Unlike 
the previous approaches, we proceeded in two steps: first linearisation followed by diagonalisation. 
More precisely, it turns out that the LDE operators $T_G$, which both linearise and diagonalise the
continuous-time system, still linearise the discrete-time dynamics, but fail to diagonalise it for four
or more loci. However, the resulting linear system may be diagonalised in a second step. This relies
on linear combinations $U_G$ of the $T_G$, with coefficients derived in a recursive manner.

As it must be, the transformation agrees with the one of Dawson llo| when translated into 
his framework. (Note that our $c(G,H)$ are coefficients of $T_H$, whereas his coefficients belong to
components of $R_H(p)$. Note also that SCR does not belong to the singular cases he excludes.) It
remains an interesting open problem how much of the above findings can be transferred to the 
general recombination model (i.e. without the restriction to single crossovers), where one loses the 
simplifying structure of ordered partitions. 
matics, and Dutch-German Bilateral Research Group on Mathematics of Random Spatial Models in Physics and
Appendix A: Five Sites

To illustrate the construction, let us spell out the example of five sites. We have $S = \{0, 1, 2, 3, 4\}$ and
$L = \{\tfrac12, \tfrac32, \tfrac52, \tfrac72\}$, the corresponding recombination probabilities $\rho_\alpha$, $\alpha \in L$, $\eta = 1 - \rho_{1/2} - \rho_{3/2} - \rho_{5/2} - \rho_{7/2}$,
and a given initial population $p_0$. Aiming at determining the coefficient functions $a_G(t)$ for all $G \subseteq L$,
we can immediately write down $a_\varnothing(t) = \eta^t$, $a_{\{1/2\}}(t) = (\eta + \rho_{1/2})^t - \eta^t$, $a_{\{7/2\}}(t) = (\eta + \rho_{7/2})^t - \eta^t$ and

$$a_{\{1/2,\,7/2\}}(t) = \eta^t - (\eta + \rho_{1/2})^t - (\eta + \rho_{7/2})^t + (\eta + \rho_{1/2} + \rho_{7/2})^t.$$

If we wanted to determine the remaining coefficient functions $a_G(t)$ for a given time $t$, they could
be calculated using the method of Geiringer (i.e. Theorem 3). But since we aim at a closed
solution for all $t$, we use the procedure developed above. To determine the coefficients of Theorem 8,
we have to calculate the corresponding $c(G,H)$ and $c^*(G,H)$. Theorems 6 and 7 imply $U_L = T_L$,
$U_{L\setminus\{\alpha\}} = T_{L\setminus\{\alpha\}}$ for $\alpha \in L$, $U_{L\setminus\{\alpha,\beta\}} = T_{L\setminus\{\alpha,\beta\}}$ for all $\alpha, \beta \in L$, as well as $U_{\{3/2\}} = T_{\{3/2\}}$ and
$U_{\{5/2\}} = T_{\{5/2\}}$. Hence, in these cases, the only non-vanishing coefficients are $c(L, L) = c(L\setminus\{\alpha\}, L\setminus\{\alpha\}) =
c(L \setminus \{\alpha,\beta\}, L \setminus \{\alpha,\beta\}) = c(\{\tfrac32\}, \{\tfrac32\}) = c(\{\tfrac52\}, \{\tfrac52\}) = 1$ for all $\alpha, \beta \in L$. It remains to determine
$U_{\{1/2\}}$, $U_{\{7/2\}}$ and $U_\varnothing$.

2 2 

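1. Constructing $U_{\{1/2\}}$:

The recursion starts with $c(\{\tfrac12\},\{\tfrac12\}) = 1$. Following (38), $z^{(L)}(\{\tfrac12\}, H) = 0$ for all $H \supsetneq \{\tfrac12\}$
except for $H = \{\tfrac12, \tfrac52\}$, and thus the only non-zero $c(\{\tfrac12\}, H)$, $H \supsetneq \{\tfrac12\}$, is

$$c\big(\{\tfrac12\}, \{\tfrac12,\tfrac52\}\big) = \frac{z^{(L)}\big(\{\tfrac12\}, \{\tfrac12,\tfrac52\}\big)}{\lambda_{\{1/2\}} - \lambda_{\{1/2,\,5/2\}}} = \frac{\rho_{3/2}\,\rho_{7/2}}{\rho_{5/2} + \rho_{3/2}\,\rho_{7/2}},$$

where we have used the recursion (43) and $\lambda_{\{1/2\}} = 1 - \rho_{3/2} - \rho_{5/2} - \rho_{7/2}$, $\lambda_{\{1/2,\,5/2\}} = (1 - \rho_{3/2})(1 - \rho_{7/2})$.
So, for the transformation (39) we obtain

$$U_{\{1/2\}} = T_{\{1/2\}} + \frac{\rho_{3/2}\,\rho_{7/2}}{\rho_{5/2} + \rho_{3/2}\,\rho_{7/2}}\; T_{\{1/2,\,5/2\}},$$

so that $U_{\{1/2\}} \circ \Phi = (1 - \rho_{3/2} - \rho_{5/2} - \rho_{7/2})\, U_{\{1/2\}}$. Analogously,

$$U_{\{7/2\}} = T_{\{7/2\}} + \frac{\rho_{1/2}\,\rho_{5/2}}{\rho_{3/2} + \rho_{1/2}\,\rho_{5/2}}\; T_{\{3/2,\,7/2\}}.$$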

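2. Constructing $U_\varnothing$:

By (38), the only non-vanishing coefficients are $c(\varnothing,\varnothing)$, $c(\varnothing,\{\tfrac32\})$, $c(\varnothing,\{\tfrac52\})$, and $c(\varnothing,\{\tfrac32,\tfrac52\})$.
They are determined by the recursion (43) and lead to the following transformation (39):

$$U_\varnothing = T_\varnothing + \frac{\rho_{1/2}(\rho_{5/2} + \rho_{7/2})}{\rho_{3/2} + \rho_{1/2}(\rho_{5/2} + \rho_{7/2})}\; T_{\{3/2\}} + \frac{(\rho_{1/2} + \rho_{3/2})\,\rho_{7/2}}{\rho_{5/2} + (\rho_{1/2} + \rho_{3/2})\,\rho_{7/2}}\; T_{\{5/2\}} + \frac{\rho_{1/2}\,\rho_{7/2}}{\rho_{1/2}\,\rho_{7/2} + \rho_{3/2} + \rho_{5/2}}\; T_{\{3/2,\,5/2\}}.$$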
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Now that we know the c{G,H), the coefficients c*{G,H) are caicuiated via (|45p . Finaiiy, the 
remaining coefficient functions foilow from Theorem [S] 

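\[
a_{\frac32}(t) = \frac{\rho_{\frac32}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}} \bigl(\lambda_{\frac32}^t - \lambda_\varnothing^t\bigr),
\]

\[
a_{\frac52}(t) = \frac{\rho_{\frac52}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}} \bigl(\lambda_{\frac52}^t - \lambda_\varnothing^t\bigr),
\]

\[
a_{\{\frac12,\frac32\}}(t) = \lambda_{\{\frac12,\frac32\}}^t - \lambda_{\frac12}^t
- \frac{\rho_{\frac32}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}} \bigl(\lambda_{\frac32}^t - \lambda_\varnothing^t\bigr),
\]

\[
a_{\{\frac12,\frac52\}}(t) = \frac{\rho_{\frac52}}{\rho_{\frac32}\rho_{\frac72}+\rho_{\frac52}} \bigl(\lambda_{\{\frac12,\frac52\}}^t - \lambda_{\frac12}^t\bigr)
- \frac{\rho_{\frac52}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}} \bigl(\lambda_{\frac52}^t - \lambda_\varnothing^t\bigr),
\]

\[
\begin{aligned}
a_{\{\frac32,\frac52\}}(t) = {} & \frac{\rho_{\frac32}+\rho_{\frac52}}{\rho_{\frac12}\rho_{\frac72}+\rho_{\frac32}+\rho_{\frac52}}\, \lambda_{\{\frac32,\frac52\}}^t
- \frac{\rho_{\frac32}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}}\, \lambda_{\frac32}^t
- \frac{\rho_{\frac52}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}}\, \lambda_{\frac52}^t \\
& + \Bigl(1 - \frac{(\rho_{\frac12}+\rho_{\frac32})\rho_{\frac72}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}}
- \frac{(\rho_{\frac52}+\rho_{\frac72})\rho_{\frac12}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}}
+ \frac{\rho_{\frac12}\rho_{\frac72}}{\rho_{\frac12}\rho_{\frac72}+\rho_{\frac32}+\rho_{\frac52}}\Bigr) \lambda_\varnothing^t ,
\end{aligned}
\]

\[
a_{\{\frac32,\frac72\}}(t) = \frac{\rho_{\frac32}}{\rho_{\frac32}+\rho_{\frac12}\rho_{\frac52}} \bigl(\lambda_{\{\frac32,\frac72\}}^t - \lambda_{\frac72}^t\bigr)
- \frac{\rho_{\frac32}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}} \bigl(\lambda_{\frac32}^t - \lambda_\varnothing^t\bigr),
\]

\[
a_{\{\frac52,\frac72\}}(t) = \lambda_{\{\frac52,\frac72\}}^t - \lambda_{\frac72}^t
- \frac{\rho_{\frac52}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}} \bigl(\lambda_{\frac52}^t - \lambda_\varnothing^t\bigr),
\]

\[
\begin{aligned}
a_{\{\frac12,\frac32,\frac52\}}(t) = {} & \lambda_{\{\frac12,\frac32,\frac52\}}^t - \lambda_{\{\frac12,\frac32\}}^t
- \frac{\rho_{\frac52}}{\rho_{\frac52}+\rho_{\frac32}\rho_{\frac72}} \bigl(\lambda_{\{\frac12,\frac52\}}^t - \lambda_{\frac12}^t\bigr)
- \frac{\rho_{\frac32}+\rho_{\frac52}}{\rho_{\frac32}+\rho_{\frac52}+\rho_{\frac12}\rho_{\frac72}}\, \lambda_{\{\frac32,\frac52\}}^t \\
& + \frac{\rho_{\frac32}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}}\, \lambda_{\frac32}^t
+ \frac{\rho_{\frac52}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}}\, \lambda_{\frac52}^t \\
& - \Bigl(1 - \frac{(\rho_{\frac12}+\rho_{\frac32})\rho_{\frac72}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}}
- \frac{(\rho_{\frac52}+\rho_{\frac72})\rho_{\frac12}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}}
+ \frac{\rho_{\frac12}\rho_{\frac72}}{\rho_{\frac12}\rho_{\frac72}+\rho_{\frac32}+\rho_{\frac52}}\Bigr) \lambda_\varnothing^t ,
\end{aligned}
\]

\[
\begin{aligned}
a_{\{\frac12,\frac32,\frac72\}}(t) = {} & \lambda_{\{\frac12,\frac32,\frac72\}}^t - \lambda_{\{\frac12,\frac32\}}^t - \lambda_{\{\frac12,\frac72\}}^t
- \frac{\rho_{\frac32}}{\rho_{\frac32}+\rho_{\frac12}\rho_{\frac52}}\, \lambda_{\{\frac32,\frac72\}}^t + \lambda_{\frac12}^t \\
& + \frac{\rho_{\frac32}}{\rho_{\frac32}+\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})}\, \lambda_{\frac32}^t
+ \frac{\rho_{\frac32}}{\rho_{\frac32}+\rho_{\frac12}\rho_{\frac52}}\, \lambda_{\frac72}^t
- \frac{\rho_{\frac32}}{\rho_{\frac32}+\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})}\, \lambda_\varnothing^t ,
\end{aligned}
\]

\[
\begin{aligned}
a_{\{\frac12,\frac52,\frac72\}}(t) = {} & \lambda_{\{\frac12,\frac52,\frac72\}}^t - \lambda_{\{\frac52,\frac72\}}^t - \lambda_{\{\frac12,\frac72\}}^t
- \frac{\rho_{\frac52}}{\rho_{\frac52}+\rho_{\frac32}\rho_{\frac72}}\, \lambda_{\{\frac12,\frac52\}}^t + \lambda_{\frac72}^t \\
& + \frac{\rho_{\frac52}}{\rho_{\frac52}+\rho_{\frac32}\rho_{\frac72}}\, \lambda_{\frac12}^t
+ \frac{\rho_{\frac52}}{\rho_{\frac52}+\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})}\, \lambda_{\frac52}^t
- \frac{\rho_{\frac52}}{\rho_{\frac52}+\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})}\, \lambda_\varnothing^t ,
\end{aligned}
\]

\[
\begin{aligned}
a_{\{\frac32,\frac52,\frac72\}}(t) = {} & \lambda_{\{\frac32,\frac52,\frac72\}}^t - \lambda_{\{\frac52,\frac72\}}^t
- \frac{\rho_{\frac32}}{\rho_{\frac32}+\rho_{\frac12}\rho_{\frac52}} \bigl(\lambda_{\{\frac32,\frac72\}}^t - \lambda_{\frac72}^t\bigr)
- \frac{\rho_{\frac32}+\rho_{\frac52}}{\rho_{\frac32}+\rho_{\frac52}+\rho_{\frac12}\rho_{\frac72}}\, \lambda_{\{\frac32,\frac52\}}^t \\
& + \frac{\rho_{\frac32}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}}\, \lambda_{\frac32}^t
+ \frac{\rho_{\frac52}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}}\, \lambda_{\frac52}^t \\
& - \Bigl(1 - \frac{(\rho_{\frac12}+\rho_{\frac32})\rho_{\frac72}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}}
- \frac{(\rho_{\frac52}+\rho_{\frac72})\rho_{\frac12}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}}
+ \frac{\rho_{\frac12}\rho_{\frac72}}{\rho_{\frac12}\rho_{\frac72}+\rho_{\frac32}+\rho_{\frac52}}\Bigr) \lambda_\varnothing^t ,
\end{aligned}
\]

and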

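\[
\begin{aligned}
a_{\{\frac12,\frac32,\frac52,\frac72\}}(t) = {} & \lambda_{\{\frac12,\frac32,\frac52,\frac72\}}^t - \lambda_{\{\frac12,\frac32,\frac52\}}^t - \lambda_{\{\frac12,\frac32,\frac72\}}^t - \lambda_{\{\frac12,\frac52,\frac72\}}^t - \lambda_{\{\frac32,\frac52,\frac72\}}^t + \lambda_{\{\frac12,\frac32\}}^t \\
& + \frac{\rho_{\frac52}}{\rho_{\frac52}+\rho_{\frac32}\rho_{\frac72}}\, \lambda_{\{\frac12,\frac52\}}^t + \lambda_{\{\frac12,\frac72\}}^t
+ \frac{\rho_{\frac32}+\rho_{\frac52}}{\rho_{\frac32}+\rho_{\frac52}+\rho_{\frac12}\rho_{\frac72}}\, \lambda_{\{\frac32,\frac52\}}^t
+ \frac{\rho_{\frac32}}{\rho_{\frac32}+\rho_{\frac12}\rho_{\frac52}}\, \lambda_{\{\frac32,\frac72\}}^t \\
& + \lambda_{\{\frac52,\frac72\}}^t
- \frac{\rho_{\frac52}}{\rho_{\frac52}+\rho_{\frac32}\rho_{\frac72}}\, \lambda_{\frac12}^t
- \frac{\rho_{\frac32}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}}\, \lambda_{\frac32}^t
- \frac{\rho_{\frac52}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}}\, \lambda_{\frac52}^t \\
& - \frac{\rho_{\frac32}}{\rho_{\frac32}+\rho_{\frac12}\rho_{\frac52}}\, \lambda_{\frac72}^t \\
& + \Bigl(1 - \frac{(\rho_{\frac52}+\rho_{\frac72})\rho_{\frac12}}{\rho_{\frac12}(\rho_{\frac52}+\rho_{\frac72})+\rho_{\frac32}}
- \frac{(\rho_{\frac12}+\rho_{\frac32})\rho_{\frac72}}{\rho_{\frac72}(\rho_{\frac12}+\rho_{\frac32})+\rho_{\frac52}}
+ \frac{\rho_{\frac12}\rho_{\frac72}}{\rho_{\frac12}\rho_{\frac72}+\rho_{\frac32}+\rho_{\frac52}}\Bigr) \lambda_\varnothing^t ,
\end{aligned}
\]

where the $\lambda_G$ are given by (42).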
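The whole five-site example can be cross-checked numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the product form of $\lambda_G$ used in this example, takes $c^*$ to be the inverse of $c$ on the subset lattice, assumes that $T_G$ and $R_G$ are related by Möbius inversion over supersets, and combines these as $a_N(t) = \sum_{H \subseteq K \subseteq M \subseteq N} (-1)^{|N \setminus M|}\, c^*(H,K)\, \lambda_K^t\, c(K,M)$. All names and numerical values are hypothetical.

```python
from fractions import Fraction as F
from functools import lru_cache
from itertools import combinations

LINKS = (1, 3, 5, 7)  # key k stands for the link k/2
rho = {1: F(1, 10), 3: F(1, 5), 5: F(3, 20), 7: F(1, 20)}  # hypothetical values
r1, r3, r5, r7 = (rho[k] for k in LINKS)
EMPTY = frozenset()


def subsets(N):
    """All subsets of N as frozensets."""
    items = sorted(N)
    return [frozenset(s) for r in range(len(items) + 1)
            for s in combinations(items, r)]


def lam(G):
    """Assumed product form of lambda_G: one factor per segment between cuts."""
    result, segment = F(1), F(0)
    for k in LINKS:
        if k in G:
            result *= 1 - segment
            segment = F(0)
        else:
            segment += rho[k]
    return result * (1 - segment)


# Non-zero off-diagonal c(G, H) as given in the example; all others vanish.
OFF_DIAGONAL = {
    (frozenset({1}), frozenset({1, 5})): r3 * r7 / (r5 + r3 * r7),
    (frozenset({7}), frozenset({3, 7})): r1 * r5 / (r3 + r1 * r5),
    (EMPTY, frozenset({3})): r1 * (r5 + r7) / (r3 + r1 * (r5 + r7)),
    (EMPTY, frozenset({5})): (r1 + r3) * r7 / (r5 + (r1 + r3) * r7),
    (EMPTY, frozenset({3, 5})): r1 * r7 / (r1 * r7 + r3 + r5),
}


def c(G, H):
    return F(1) if G == H else OFF_DIAGONAL.get((G, H), F(0))


@lru_cache(maxsize=None)
def c_star(G, H):
    """Inverse of c on the subset lattice (assumed meaning of c*)."""
    if G == H:
        return F(1)
    if not G < H:
        return F(0)
    return -sum(c_star(G, K) * c(K, H) for K in subsets(H) if G <= K and K < H)


def a(N, t):
    """Coefficient function a_N(t) assembled from c, c* and lambda."""
    N = frozenset(N)
    total = F(0)
    for M in subsets(N):
        b_M = sum(c_star(H, K) * lam(K) ** t * c(K, M)
                  for K in subsets(M) for H in subsets(K))
        total += (-1) ** (len(N) - len(M)) * b_M
    return total


t = 3
coefficients = {N: a(N, t) for N in subsets(LINKS)}
# Closed form for a_{3/2}(t) from the list above, for comparison.
closed_a32 = r3 / (r1 * (r5 + r7) + r3) * (lam({3}) ** t - lam(EMPTY) ** t)
```

Two consistency properties are worth checking with such a sketch: the sixteen coefficient functions should sum to one for every $t$, and at $t = 1$ they should reduce to $\eta$ for $\varnothing$, to $\rho_\alpha$ for the single links, and to zero for all larger sets of links.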
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